Abstract

Codes that combine channel estimation and error protection have received considerable attention recently, and are regarded as a promising means of compensating for the multipath fading effect. Simulations have shown that such a code design can considerably improve the system performance over the conventional design with separate channel estimation and error protection modules at the same code rate. Nevertheless, the major obstacle to putting these codes into practice is that the existing codes are mostly found by computer search, and hence exhibit no structure that lends itself to efficient decoding. Time-consuming exhaustive search therefore becomes the only decoding choice, and the decoding complexity increases dramatically with the codeword length. In this paper, by optimizing the signal-to-noise ratio, we find a systematic construction of codes for combined channel estimation and error protection, and confirm by simulations that its performance is equivalent to that of the computer-searched codes. Moreover, the structured codes that we construct by rules are maximum-likelihood decodable in terms of a newly derived recursive metric for use with the priority-first search decoding algorithm. Thus, the decoding complexity is significantly reduced compared with that of the exhaustive decoder. An extension of the code design to fast-fading channels is also presented. Simulations show that our constructed extension code is robust in performance even if the coherence period is shorter than the codeword length.

Index Terms 

Code design, Priority-first search decoding, Training codes, Time-varying multipath fading channel, 
Channel estimation, Channel equalization, Error-control coding 

I. Introduction 

The growing demand for wireless communications in recent years has driven rapid advances in wireless transmission technology. Technology has blossomed in both high-mobility low-bit-rate and low-mobility high-bit-rate transmission. The next challenge in wireless communications is therefore to reach a high transmission rate under high mobility. The main technological obstacle to high-bit-rate transmission under high mobility is the highly time-varying channel characteristic caused by movement, which makes it even harder to compensate for intersymbol interference. At present, a typical receiver for wireless communications contains separate modules for channel estimation and channel equalization. The former module estimates the channel parameters based on a known training sequence or pilots, while the latter module uses these estimated channel parameters to eliminate the channel effects due to multipath fading. However, the effectiveness of such a system structure in eliminating channel fading may degrade in a fast time-varying environment, which makes high-bit-rate transmission in a high-mobility environment a big challenge.

Recent studies [3][6][11][17][18] have confirmed that jointly considering a number of system devices, such as channel coding, channel equalization, channel estimation, and modulation, yields better system performance than a system with individually optimized devices. In particular, some works that combine codeword decision and channel effect cancellation in typical receivers can dispense with explicit channel estimation and still perform well. In 1994, Seshadri [17] first proposed a blind maximum-likelihood sequence estimator (MLSE) in which the data and channel are estimated simultaneously. Skoglund et al. [18] later provided milestone evidence that the jointly designed system is superior in combating severe multipath block fading. They also applied a similar technique to a multiple-input-multiple-output (MIMO) system in a subsequent work [6]. In short, Skoglund et al. looked by computer search for non-linear codes that are suitable for this channel. Through simulations, they found that a non-linear code that combines channel estimation and error protection, when carefully designed by considering the multipath fading effect, outperforms a typical communication system with perfect channel estimation by at least 2 dB. Their results suggest the high potential of applying a single, perhaps non-linear, code to improve the transmission rate in a highly mobile environment, in which channel estimation becomes technically infeasible. A similar approach was also proposed in [3], whose authors named such codes the training codes. In [2], Chugg and Polydoros derived a recursive metric for joint maximum-likelihood (ML) decoding, and hinted that the recursive metric may only be used with sequential algorithms [1]. As there are no efficient decoding approaches for the codes mentioned above, these authors mostly considered only codes of short length, or even just the principle of code design for combined channel estimation and error protection.

One of the drawbacks of these combined-channel-estimation-and-error-protection codes is that, owing to their lack of systematic structure, only exhaustive search can be used to decode their codewords. This drawback inhibits the use of codes for combined channel estimation and error protection in practical applications. It leads to a natural research question: how can one construct an efficiently decodable code with both channel estimation and error protection functions?

In this work, the research question is resolved by first finding that the codeword that maximizes the system signal-to-noise ratio (SNR) should be orthogonal to its delayed counterpart. We then find that a code consisting of properly chosen self-orthogonal codewords can compete with the computer-searched codes in performance. With this self-orthogonality property, the maximum-likelihood metrics for these structured codewords can be equivalently fitted into a recursive formula, and hence the priority-first search decoding algorithm can be employed. As a consequence, the decoding complexity is considerably reduced compared with exhaustive decoding. Extensions of our proposed coding structure, which was originally designed for channels with constant coefficients, to channels whose coefficients vary within a codeword block are also established. Simulations show that our constructed extension code is robust even for a channel whose coefficients change more often than once per coding block.

The paper is organized as follows. Section II describes the system model considered, followed by the technical background required in this work. In Section III, the coding rule that optimizes the system SNR is established, and is subsequently used to construct the codes for combined channel estimation and error protection. The corresponding recursive maximum-likelihood decoding metrics for our rule-based systematic codes are derived in Section IV. Simulations are summarized and remarked upon in Section V. The extension to channels with varying coefficients within a codeword is presented in Section VI. Section VII concludes the paper.

In this work, superscripts "H" and "T" are reserved for the matrix Hermitian transpose and transpose operations, respectively [10], and should not be confused with the matrix exponent.



II. Background 
A. System model and maximum-likelihood decoding criterion 

The system model defined in this section and the notations used throughout follow those in 
[18]. 

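Transmit a codeword $\boldsymbol{b} = [b_1, \ldots, b_N]^T$, where each $b_j \in \{\pm 1\}$, of an $(N, K)$ code $\mathcal{C}$ over a block fading (specifically, quasi-static fading) channel of memory order $(P-1)$. Denote the channel coefficients by $\boldsymbol{h} = [h_1, \ldots, h_P]^T$, which are assumed constant within a coding block. The complex-valued received vector is then given by:

$$\boldsymbol{y} = \mathbb{B}\boldsymbol{h} + \boldsymbol{n}, \tag{1}$$

where $\boldsymbol{n}$ is zero-mean complex-Gaussian distributed with $E[\boldsymbol{n}\boldsymbol{n}^H] = \sigma_n^2\,\mathbb{I}_L$, $\mathbb{I}_L$ is the $L \times L$ identity matrix, and

$$\mathbb{B} = \begin{bmatrix}
b_1 & 0 & \cdots & 0 \\
\vdots & b_1 & \ddots & \vdots \\
b_N & \vdots & \ddots & 0 \\
0 & b_N & & b_1 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & b_N
\end{bmatrix}_{L \times P},$$

so that each column of $\mathbb{B}$ is the codeword delayed by one more sample than the previous column, and $L = N + P - 1$.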

Some assumptions are made in the following. Both the transmitter and the receiver know nothing about the channel coefficients $\boldsymbol{h}$, but know the multipath parameter $P$ or an upper bound on it. In addition, there is an adequate guard period between two encoding blocks so that zero interblock interference is guaranteed. Based on the system model in (1) and the above assumptions, we can derive [18] the least-squares (LS) estimate of the channel coefficients $\boldsymbol{h}$ for a given $\boldsymbol{b}$ (interchangeably, $\mathbb{B}$) as:

$$\hat{\boldsymbol{h}} = (\mathbb{B}^T\mathbb{B})^{-1}\mathbb{B}^T\boldsymbol{y},$$

and the joint maximum-likelihood (ML) decision on the transmitted codeword becomes:

$$\hat{\boldsymbol{b}} = \arg\min_{\boldsymbol{b}\in\mathcal{C}} \|\boldsymbol{y} - \mathbb{B}\hat{\boldsymbol{h}}\|^2 = \arg\min_{\boldsymbol{b}\in\mathcal{C}} \|\boldsymbol{y} - \mathbb{P}_B\boldsymbol{y}\|^2, \tag{2}$$

where $\mathbb{P}_B = \mathbb{B}(\mathbb{B}^T\mathbb{B})^{-1}\mathbb{B}^T$. Notably, the codeword $\boldsymbol{b}$ and the transformed codeword $\mathbb{P}_B$ are not in one-to-one correspondence unless the first element of $\boldsymbol{b}$, namely $b_1$, is fixed. For convenience, we will always set $b_1 = -1$ for the codebooks we construct in the sequel.
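As an illustration of criterion (2), the following minimal Python sketch performs the exhaustive joint ML decoding for memory order one ($P = 2$). It is not taken from the paper: the function names, the toy codebook (all length-4 sequences with $b_1 = -1$) and the channel taps are assumptions chosen only for demonstration, and the projection term $\boldsymbol{y}^H\mathbb{P}_B\boldsymbol{y}$ is evaluated through the closed-form inverse of the $2 \times 2$ matrix $\mathbb{B}^T\mathbb{B}$.

```python
from itertools import product

def transmit(b, h):
    """Noise-free channel output y = B h, where column i of B is b delayed by i."""
    N, P = len(b), len(h)
    L = N + P - 1
    return [sum(h[i] * b[k - i] for i in range(P) if 0 <= k - i < N)
            for k in range(L)]

def ml_metric(b, y):
    """||y - P_B y||^2 for P = 2, using the 2x2 Gram matrix B^T B = [[N, c], [c, N]]."""
    N = len(b)
    c = sum(b[k] * b[k + 1] for k in range(N - 1))   # off-diagonal entry of B^T B
    z0 = sum(b[k] * y[k] for k in range(N))          # first entry of B^T y
    z1 = sum(b[k] * y[k + 1] for k in range(N))      # second entry of B^T y
    det = N * N - c * c                              # positive because |c| <= N - 1
    # y^H P_B y = z^H (B^T B)^{-1} z with (B^T B)^{-1} = [[N, -c], [-c, N]] / det
    proj = (N * abs(z0) ** 2 + N * abs(z1) ** 2
            - 2 * c * (z0.conjugate() * z1).real) / det
    return sum(abs(v) ** 2 for v in y) - proj

def exhaustive_decode(codebook, y):
    """Joint ML decision of (2): try every codeword and keep the smallest metric."""
    return min(codebook, key=lambda b: ml_metric(b, y))

# Hypothetical toy setting: all length-4 sequences with b_1 = -1, assumed channel taps.
codebook = [(-1,) + tail for tail in product((-1, 1), repeat=3)]
h = [0.8 + 0.3j, -0.4 + 0.5j]
decoded = exhaustive_decode(codebook, transmit((-1, 1, -1, -1), h))
```

The decoder never forms an explicit channel estimate; the estimate is absorbed into the projection $\mathbb{P}_B$. The cost grows linearly with the codebook size, that is, exponentially in $K$, which is the complexity issue addressed in the rest of the paper.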

B. Summary of previous and our code designs for combined channel estimation and error 
protection 

In the literature, no systematic code constructions have been proposed for combined channel estimation and error protection over quasi-static fading channels. Efforts were mostly placed on finding, by computer search, proper sequences to compensate for the channel fading [3][6][13][14][18][19][21]. Decodability of the perhaps structureless computer-searched codes thus becomes an engineering challenge.

In 2003, Skoglund, Giese and Parkvall [18] searched by computer for nonlinear binary block codes suitable for combined estimation and error protection over quasi-static fading channels by minimizing the sum of the pairwise error probabilities (PEP) under equal prior, namely,

$$P_e \le 2^{-K}\sum_{i=1}^{2^K}\sum_{j=1,\, j\ne i}^{2^K} \Pr\left(\hat{\boldsymbol{b}} = \boldsymbol{b}(j) \,\middle|\, \boldsymbol{b}(i) \text{ transmitted}\right), \tag{3}$$

where $\boldsymbol{b}(i)$ denotes the $i$th codeword of the $(N, K)$ nonlinear block code. Although the operating signal-to-noise ratio (SNR) for the code search was set at 10 dB, their simulation results showed that the codes found perform well over a wide range of SNRs. In addition, a mismatch in the relative powers of different channel coefficients, as well as in the channel Rice factors [20], has little effect on the resultant performance. It was concluded that, in comparison with a system using a benchmark error correcting code and a perfect channel estimator, significant performance improvement can be obtained by adopting their computer-searched nonlinear codes. Later, in 2005, Coskun and Chugg [3] replaced the PEP in (3) by a properly defined pairwise distance measure between two codewords, and proposed a suboptimal greedy algorithm to speed up the code search process. In 2007, Giese and Skoglund [6] re-applied their original idea to single- and multiple-antenna systems, and used the asymptotic PEP and a generic gradient-search algorithm in place of the PEP and the simulated annealing algorithm in [18] to reduce the system complexity.

At the end of [18], the authors pointed out that "an important topic for further research is to 
study how the decoding complexity of the proposed scheme can be decreased." They proceeded 
to state that along this research line, "one main issue is to investigate what kind of structure 
should be enforced on the code to allow for simplified decoding." 

Motivated by these closing statements, we take a different approach to code design. Specifically, we pursue and establish a systematic code design rule for combined channel estimation and error protection over quasi-static fading channels, and confirm that the codes constructed according to this rule maximize the average system SNR. Since it so happens that the computer-searched code in [18] satisfies this rule, its insensitivity to SNRs, as well as to channel mismatch, finds a theoretical footing. Thanks to the systematic structure of our rule-based codes, we can then derive a recursive maximum-likelihood decoding metric for use with the priority-first search decoding algorithm. The decoding complexity is therefore significantly decreased at moderate-to-high SNRs, in contrast to the exhaustive decoder that the structureless computer-searched codes are obliged to use.

It is worth mentioning that although the codes searched by computer in [6][18] target unknown channels, for which the channel coefficients are assumed constant within a coding block, the evaluation of the PEP criterion does require knowledge of the channel statistics. The code constructed according to the rule we propose, however, is guaranteed to maximize the system SNR regardless of the statistics of the channels. This hints that our code can still be applied where channel blindness is a strict system restriction. Details will be introduced in subsequent sections.

C. Maximum-likelihood priority-first search decoding algorithm 

For a better understanding, before describing the priority-first search decoding algorithm in this subsection, we give a short description of the code tree for the $(N, K)$ code $\mathcal{C}$ over which the decoding search is performed.

A code tree of an $(N, K)$ binary code represents every codeword as a path on a binary tree, as shown in Fig. 1. The code tree consists of $(N + 1)$ levels. The single leftmost node at level zero is usually called the origin node. There are at most two branches leaving each node at each level. The $2^K$ rightmost nodes at level $N$ are called the terminal nodes.

Each branch on the code tree is labeled with the appropriate code bit $b_\ell$. By convention, the path from the single origin node to one of the $2^K$ terminal nodes is termed the code path corresponding to the codeword. Since there is a one-to-one correspondence between the codewords and the code paths of $\mathcal{C}$, a codeword can be interchangeably referred to by its respective code path or by the branch labels that the code path traverses. Similarly, for any node in the code tree, there exists a unique path from the single origin node to it; hence, a node can also be interchangeably indicated by the path (or the path labels) ending at it. We can then denote the path ending at a node at level $\ell$ by the branch labels $[b_1, b_2, \ldots, b_\ell]$ that it traverses. For convenience, we abbreviate $[b_1, b_2, \ldots, b_\ell]^T$ as $\boldsymbol{b}_{(\ell)}$, and will drop the subscript when $\ell = N$. The successor paths of a path $\boldsymbol{b}_{(\ell)}$ are those whose first $\ell$ labels are exactly the same as $\boldsymbol{b}_{(\ell)}$.

Fig. 1. The code tree for a computer-searched PEP-minimum (4,2) code with $b_1$ fixed as $-1$.

The priority-first search on a code tree is guided by an evaluation function $f$ that is defined for every path. It can typically be stated as the following algorithm.

Step 1. (Initialization) Load the Stack with the path that ends at the origin node.

Step 2. (Evaluation) Evaluate the $f$-function values of the successor paths of the current top path in the Stack, and delete this top path from the Stack.

Step 3. (Sorting) Insert the successor paths obtained in Step 2 into the Stack such that the paths in the Stack are ordered according to their ascending $f$-function values.

Step 4. (Loop) If the top path in the Stack ends at a terminal node in the code tree, output the labels corresponding to the top path, and the algorithm stops; otherwise, go to Step 2.

It remains to find the evaluation function $f$ that secures the maximum-likelihoodness of the output codeword. We begin by introducing a sufficient condition under which the above priority-first search algorithm is guaranteed to locate the code path with the smallest $f$-function value among all code paths of $\mathcal{C}$.

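Lemma 1: If $f$ is non-decreasing along every path $\boldsymbol{b}_{(\ell)}$ in the code tree, i.e.,

$$f(\boldsymbol{b}_{(\ell)}) \le \min_{\{\tilde{\boldsymbol{b}}\in\mathcal{C} \,:\, \tilde{\boldsymbol{b}}_{(\ell)} = \boldsymbol{b}_{(\ell)}\}} f(\tilde{\boldsymbol{b}}), \tag{4}$$

the priority-first search algorithm always outputs the code path with the smallest $f$-function value among all code paths of $\mathcal{C}$.

Proof: Let $\boldsymbol{b}^*$ be the first top path that reaches a terminal node (and hence is the output code path of the priority-first search algorithm). Then, Step 3 of the algorithm ensures that $f(\boldsymbol{b}^*)$ is no larger than the $f$-function value of any path currently in the Stack. Since condition (4) guarantees that the $f$-function value of any other code path, which must be an offspring of some path $\boldsymbol{b}_{(\ell)}$ existing in the Stack, is no less than $f(\boldsymbol{b}_{(\ell)})$, we have

$$f(\boldsymbol{b}^*) \le f(\boldsymbol{b}_{(\ell)}) \le \min_{\{\tilde{\boldsymbol{b}}\in\mathcal{C} \,:\, \tilde{\boldsymbol{b}}_{(\ell)} = \boldsymbol{b}_{(\ell)}\}} f(\tilde{\boldsymbol{b}}).$$

Consequently, the lemma follows. ∎

In the design of the search-guiding function $f$, it is convenient to divide it into the sum of two parts. In order to perform maximum-likelihood decoding, the first part $g$ can be directly defined based on the maximum-likelihood metric of the codewords such that, from (2),

$$\arg\min_{\boldsymbol{b}\in\mathcal{C}} g(\boldsymbol{b}) = \arg\min_{\boldsymbol{b}\in\mathcal{C}} \|\boldsymbol{y} - \mathbb{P}_B\boldsymbol{y}\|^2.$$

After $g$ is defined, the second part $h$ can be designed to validate (4) with $h(\boldsymbol{b}) = 0$ for any $\boldsymbol{b} \in \mathcal{C}$. Then, from $f(\boldsymbol{b}) = g(\boldsymbol{b}) + h(\boldsymbol{b}) = g(\boldsymbol{b})$ for all $\boldsymbol{b} \in \mathcal{C}$, the desired maximum-likelihood priority-first search decoding algorithm is established since (4) is valid.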

In principle, both $g(\cdot)$ and $h(\cdot)$ range over all possible paths in the code tree. The first part, $g(\cdot)$, is simply a function of all the branches traversed thus far by the path, while the second part, $h(\cdot)$, called the heuristic function, helps predict a future route from the end node of the current path to a terminal node [8]. Notably, the design of the heuristic function $h$ that validates condition (4) is not unique. Different designs may result in variations in computational complexity.
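The four steps of the priority-first search can be sketched in Python as below. This is only an illustrative sketch under stated assumptions: the Stack is realized with the standard-library `heapq` module, the code tree is represented by the set of codeword prefixes, and the evaluation function used in the example (the Hamming distance between a path and a hypothetical hard-decision vector `r`, so that $h = 0$) is a toy stand-in for the ML metric derived later. The four-codeword codebook is likewise hypothetical and is not the code of Fig. 1.

```python
import heapq

def priority_first_search(codebook, f):
    """Return the code path with the smallest f-value, assuming f is
    non-decreasing along every path of the code tree (Lemma 1)."""
    N = len(codebook[0])
    # Nodes of the code tree, each identified by the path (prefix) ending at it.
    prefixes = {c[:l] for c in codebook for l in range(N + 1)}
    stack = [(f(()), ())]                      # Step 1: path ending at the origin node
    while stack:
        _, top = heapq.heappop(stack)          # take (and delete) the current top path
        if len(top) == N:                      # Step 4: top path ends at a terminal node
            return top
        for bit in (-1, 1):                    # Steps 2-3: evaluate and insert successors
            succ = top + (bit,)
            if succ in prefixes:
                heapq.heappush(stack, (f(succ), succ))
    return None

# Hypothetical example: a toy (4,2) codebook with b_1 = -1 and a toy metric.
codebook = [(-1, -1, 1, -1), (-1, 1, -1, -1), (-1, 1, 1, 1), (-1, -1, -1, 1)]
r = (-1, 1, 1, -1)

def f(path):
    # Number of disagreements with r so far; it can only grow as the path extends.
    return sum(1 for a, b in zip(path, r) if a != b)

best = priority_first_search(codebook, f)
```

Because the toy metric never decreases when a path is extended, condition (4) holds and the first terminal path popped from the heap attains the minimum $f$-value over the whole codebook, typically without visiting every node.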

We close this section by summarizing the targets of this work based on what has been mentioned in this section.

1) A code of comparable performance to the computer-searched code is constructed according to certain rules so that its code tree can be efficiently and systematically generated (Section III).

2) Efficient recursive computation of the maximum-likelihood evaluation function $f$ from the predecessor path to the successor paths is established (Section IV).

3) With the availability of items 1) and 2), the construction and maximum-likelihood decoding of codes with longer codeword length becomes possible, which in turn makes the assumption that the unknown channel coefficients $\boldsymbol{h}$ are fixed during a long coding block somewhat impractical, especially for mobile transceivers. An extension of items 1) and 2) to unknown channels whose coefficients may change several times during one coding block will be further proposed (Section VI).

III. Code Construction 

In this section, the code design rule that guarantees the maximization of the system SNR 
regardless of the channel statistics is presented, followed by the algorithm to generate the code 
based on such rule. 



February 2, 2008 



DRAFT 






A. Code rule that maximizes the average SNR 

A known inequality [15] for the product of two positive semidefinite Hermitian matrices, $A$ and $B$, is that

$$\mathrm{tr}(AB) \le \mathrm{tr}(A)\cdot\lambda_{\max}(B), \tag{5}$$

where $\mathrm{tr}(\cdot)$ represents the matrix trace operation, and $\lambda_{\max}(B)$ is the maximal eigenvalue of $B$ [10]. The above inequality holds with equality when $B$ is proportional to an identity matrix.

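From the system model $\boldsymbol{y} = \mathbb{B}\boldsymbol{h} + \boldsymbol{n}$, it can be derived that the average SNR satisfies:

$$\begin{aligned}
\text{Average SNR} &= \frac{E[\|\mathbb{B}\boldsymbol{h}\|^2]}{E[\|\boldsymbol{n}\|^2]} \\
&= \frac{E[\mathrm{tr}(\boldsymbol{h}^H\mathbb{B}^T\mathbb{B}\boldsymbol{h})]}{L\sigma_n^2} \\
&= \frac{\mathrm{tr}(E[\boldsymbol{h}\boldsymbol{h}^H]\,\mathbb{B}^T\mathbb{B})}{L\sigma_n^2} \\
&= \frac{N}{L\sigma_n^2}\,\mathrm{tr}\!\left(E[\boldsymbol{h}\boldsymbol{h}^H]\,\frac{1}{N}\mathbb{B}^T\mathbb{B}\right) \\
&\le \frac{N}{L\sigma_n^2}\,\mathrm{tr}\!\left(E[\boldsymbol{h}\boldsymbol{h}^H]\right)\lambda_{\max}\!\left(\frac{1}{N}\mathbb{B}^T\mathbb{B}\right).
\end{aligned}$$

Then, the theory behind Ineq. (5) implies that taking

$$\mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P, \tag{6}$$

where $\mathbb{I}_P$ is the $P \times P$ identity matrix, will optimize the average SNR regardless of the statistics of $\boldsymbol{h}$ [5].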

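The existence of codeword sequences satisfying (6) is guaranteed only for $P = 2$ with $N$ odd (and trivially, $P = 1$). In some other cases such as $P = 3$, one can only design codes to approximately satisfy (6) as:

$$\mathbb{B}^T\mathbb{B} = \begin{bmatrix} N & \pm 1 & 0 \\ \pm 1 & N & \pm 1 \\ 0 & \pm 1 & N \end{bmatrix} \quad \text{for } N \text{ even}$$

and

$$\mathbb{B}^T\mathbb{B} = \begin{bmatrix} N & 0 & \pm 1 \\ 0 & N & 0 \\ \pm 1 & 0 & N \end{bmatrix} \quad \text{for } N \text{ odd}.$$

Owing to this observation, we will relax (6) to allow some off-diagonal entries in $\mathbb{B}^T\mathbb{B}$ to be either $1$ or $-1$ whenever a strict maintenance of (6) is impossible.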

Empirical examination by a simulated-annealing code-search algorithm shows that for $4 \le N \le 16$ and $N$ even, the best half-rate codes that minimize the sum of PEPs in (3) under¹ complex zero-mean Gaussian distributed $\boldsymbol{h}$ with $E[\boldsymbol{h}\boldsymbol{h}^H] = (1/2)\mathbb{I}_P$ and $P = 2$ all satisfy

$$\mathbb{B}^T\mathbb{B} = \begin{bmatrix} N & \pm 1 \\ \pm 1 & N \end{bmatrix} \tag{7}$$

except three codewords at $N = 14$. A possible cause for the appearance of three exceptional codewords at $N = 14$ is that the best code that minimizes the sum of the pairwise error probabilities may not be the truly optimal code that reaches the smallest error probability, and hence does not necessarily yield the maximum average SNR as demanded by (6). We have also obtained and examined the computer-searched code used in [18] for $N = 22$, and found, as anticipated, that every codeword has the property of (7).

The operational meaning of the condition $\mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P$ is that the codeword is orthogonal to its shifted counterpart, and hence a space-diversity nature is implicitly enforced. This coincides with the conclusion made in [4] that a training sequence for which $\mathbb{B}^T\mathbb{B}$ is proportional to $\mathbb{I}_P$ can provide optimal channel estimation performance. It should be mentioned that codeword condition (6) has been identified in [6], and the authors in [6, pp. 1591] remarked that a code sequence with a certain aperiodic autocorrelation property can possibly be exploited in future code design approaches, which is one of the main research goals of this paper.
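To make the self-orthogonality condition concrete, the short Python sketch below computes $\mathbb{B}^T\mathbb{B}$ directly from the aperiodic autocorrelation of a sequence. The helper name `gram` and the sample sequence are assumptions made for illustration; they do not appear in the paper.

```python
def gram(b, P):
    """B^T B for the L x P matrix B whose i-th column is b delayed by i samples.
    Entry (i, j) equals the aperiodic autocorrelation of b at lag |i - j|."""
    N = len(b)

    def acorr(lag):
        return sum(b[k] * b[k + lag] for k in range(N - lag))

    return [[acorr(abs(i - j)) for j in range(P)] for i in range(P)]

# Hypothetical length-5 sequence with b_1 = -1.
b = (-1, -1, -1, 1, -1)
G2 = gram(b, 2)   # memory order one
G3 = gram(b, 3)   # memory order two
```

For this sequence, `G2` is exactly $5\cdot\mathbb{I}_2$, so the sequence is orthogonal to its one-sample delay. With $P = 3$ and $N$ odd, the lag-2 autocorrelation is a sum of an odd number of $\pm 1$ terms and cannot vanish, which is why only the relaxed form with $\pm 1$ off-diagonal entries can be met.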



B. Equivalent system model for combined channel estimation and error protection codes 

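By noting² that $\mathbb{P}_B$ is idempotent and symmetric, and both $\mathrm{tr}(\mathbb{P}_B)$ and $\|\mathrm{vec}(\mathbb{P}_B)\|^2$ are equal to $P$, where $\mathrm{vec}(\cdot)$ denotes the operation that transforms an $(M \times N)$ matrix into an $(MN \times 1)$ vector,³ the joint maximum-likelihood decision in (2) can be reformulated as:

$$\begin{aligned}
\hat{\boldsymbol{b}} &= \arg\min_{\boldsymbol{b}\in\mathcal{C}} \|\boldsymbol{y} - \mathbb{P}_B\boldsymbol{y}\|^2 \\
&= \arg\min_{\boldsymbol{b}\in\mathcal{C}} (\boldsymbol{y} - \mathbb{P}_B\boldsymbol{y})^H(\boldsymbol{y} - \mathbb{P}_B\boldsymbol{y}) \\
&= \arg\min_{\boldsymbol{b}\in\mathcal{C}} (\boldsymbol{y}^H\boldsymbol{y} - \boldsymbol{y}^H\mathbb{P}_B\boldsymbol{y}) \\
&= \arg\min_{\boldsymbol{b}\in\mathcal{C}} -\mathrm{tr}(\mathbb{P}_B\boldsymbol{y}\boldsymbol{y}^H) && (8) \\
&= \arg\min_{\boldsymbol{b}\in\mathcal{C}} \left(\|\mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H)\|^2 - \mathrm{vec}(\mathbb{P}_B)^T\mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H) - \mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H)^H\mathrm{vec}(\mathbb{P}_B) + \|\mathrm{vec}(\mathbb{P}_B)\|^2\right) \\
&= \arg\min_{\boldsymbol{b}\in\mathcal{C}} \|\mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H) - \mathrm{vec}(\mathbb{P}_B)\|^2. && (9)
\end{aligned}$$

Fig. 2. Equivalent system model for combined channel estimation and error protection codes.

¹ The adopted statistical parameters of $\boldsymbol{h}$ follow those in [18].

² The validity of the claimed statement here does not require the SNR-optimization condition $\mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P$.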

We therefore transform the original system in (1) to an equivalent system model that contains an outer product demodulator and a minimum Euclidean distance selector in the $\mathbb{P}_B$-domain, as shown in Fig. 2. As the outer product demodulator can be viewed as a generalization of the square-law combining that is in popular use in non-coherent detection for both slow and fast fading [16], the above equivalent transformation suggests a potential application of combined channel estimation and error protection codes to non-coherent systems in which the fading is rapid enough to preclude a good estimate of the channel coefficients. Further discussion on how to design codes for unknown fast-fading channels will be continued in Section VI.



³ $\mathrm{vec}(A)$ for a matrix $A$ with first row $[a_{1,1}\ a_{1,2}\ \cdots\ a_{1,S}]$ is the column vector obtained by stacking the columns of $A$.

As a consequence of (9), maximum-likelihood decoding amounts to finding the codeword $\mathbb{P}_B$ whose Euclidean distance to $\boldsymbol{y}\boldsymbol{y}^H$ is the smallest. Similar to (3), we can then bound the error probability by:

$$P_e \le 2^{-K}\sum_{i=1}^{2^K}\sum_{j=1,\, j\ne i}^{2^K} \Pr\left(\|\mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H) - \mathrm{vec}(\mathbb{P}_{B(j)})\|^2 < \|\mathrm{vec}(\boldsymbol{y}\boldsymbol{y}^H) - \mathrm{vec}(\mathbb{P}_{B(i)})\|^2 \,\middle|\, \boldsymbol{b}(i) \text{ transmitted}\right). \tag{10}$$

The PEP-based upper bound in (10) hints that a good code design should have an adequately large pairwise Euclidean distance

$$\|\mathrm{vec}(\mathbb{P}_{B(i)}) - \mathrm{vec}(\mathbb{P}_{B(j)})\|^2 \tag{11}$$

among all codeword pairs $\mathbb{P}_{B(i)}$ and $\mathbb{P}_{B(j)}$, where $\mathbb{P}_{B(i)}$ is the equivalent codeword in the $\mathbb{P}_B$-domain, and is in one-to-one correspondence with the codeword $\boldsymbol{b}(i)$ if $b_1$ is fixed and known. Based on this observation, we may infer under equal prior that a uniform draw of codewords satisfying $\mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P$ in the $\mathbb{P}_B$-domain may asymptotically result in a good code. In light of the one-to-one correspondence between $\boldsymbol{b}$ and $\mathbb{P}_B$, we may further infer that uniform selection of codewords in the set

$$\mathcal{A} = \left\{\mathbb{P}_B = \mathbb{B}(\mathbb{B}^T\mathbb{B})^{-1}\mathbb{B}^T : \exists\, \boldsymbol{b} \in \{\pm 1\}^N \text{ such that } \mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P\right\}$$

is conceptually equivalent to a uniform pick of codewords in $\{\boldsymbol{b} \in \{\pm 1\}^N : \mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P\}$.

Recall that in order to perform the priority-first search decoding on a code tree, an efficient and systematic way to generate the code tree (or more specifically, an efficient and systematic way to generate the successor paths of the top path) is necessary. The uniform pick principle then suggests that, considering only the codewords with the same prefix $[b_1, \ldots, b_\ell]$, the ratio of the number of codewords satisfying $b_{\ell+1} = -1$ to the size of the corresponding candidate sequence pool shall be made equal to that of the codewords satisfying $b_{\ell+1} = 1$, whenever possible. This can be mathematically interpreted as:

$$\frac{|\mathcal{C}(b_1, b_2, \ldots, b_\ell, +1)|}{|\mathcal{A}(b_1, b_2, \ldots, b_\ell, +1 \,|\, N\cdot\mathbb{I}_P)|} \approx \frac{|\mathcal{C}(b_1, b_2, \ldots, b_\ell, -1)|}{|\mathcal{A}(b_1, b_2, \ldots, b_\ell, -1 \,|\, N\cdot\mathbb{I}_P)|},$$

where $\mathcal{C}(\boldsymbol{b}_{(\ell)})$ is the set of all codewords whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$, and $\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})$ is the set of all possible $\pm 1$-sequences of length $N$ whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$ and whose $\mathbb{B}$-representation satisfies $\mathbb{B}^T\mathbb{B} = \mathbb{G}$. Accordingly, if $|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})|$ can be computed explicitly, the desired efficient and systematic generation of the code tree becomes straightforward.

Simulations of the above uniformly selected code over $\{\boldsymbol{b} \in \{\pm 1\}^N : \mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P\}$ show that its performance is almost the same as that of the computer-searched code that minimizes the sum of PEPs. Hence, the intuition of maximizing the pairwise Euclidean distance that we adopt for code design performs as well as we anticipated.

In the next subsection, we provide an exemplified encoding algorithm based on the above basic rule, specifically for channels of memory order 1, namely, $P = 2$. The encoding algorithm for larger memory orders can be built similarly.



C. Exemplified encoding algorithm for channels of memory order one 

Before the presentation of the exemplified encoding algorithm, the explicit formula for $|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})|$ needs to be established first.⁴

Lemma 2: Fix $P = 2$. Then, for $N$ odd and $\mathbb{G} = N\cdot\mathbb{I}_P$,

$$|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})| = \binom{N-\ell}{(N-\ell-m_\ell)/2}\,\mathbf{1}\{|m_\ell| \le N-\ell\} \quad \text{for } 1 \le \ell \le N,$$

where $\mathbf{1}\{\cdot\}$ is the set indicator function, and $m_\ell = (b_1b_2 + \cdots + b_{\ell-1}b_\ell)\cdot\mathbf{1}\{\ell > 1\}$. In addition, for $N$ even, and

$$\mathbb{G}_1 = \begin{bmatrix} N & -1 \\ -1 & N \end{bmatrix} \quad \text{and} \quad \mathbb{G}_2 = \begin{bmatrix} N & 1 \\ 1 & N \end{bmatrix},$$

$$|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G}_\theta)| = \binom{N-\ell}{(N-\ell+(-1)^\theta-m_\ell)/2}\,\mathbf{1}\{|(-1)^\theta - m_\ell| \le N-\ell\} \quad \text{for } 1 \le \ell \le N.$$

Here, we assume that $\binom{0}{0} = 1$ specifically for the case of $\ell = N$.

Proof: The lemma requires

$$c = b_1b_2 + b_2b_3 + \cdots + b_{N-1}b_N = m_\ell + b_\ell b_{\ell+1} + \cdots + b_{N-1}b_N, \tag{13}$$

where $c = 0, -1, +1$ respectively for $\mathbb{G}$, $\mathbb{G}_1$ and $\mathbb{G}_2$. In order to satisfy (13), there should be $(N - \ell + c - m_\ell)/2$ of $\{b_\ell b_{\ell+1}, b_{\ell+1}b_{\ell+2}, \ldots, b_{N-1}b_N\}$ equal to $1$, and the remaining of them equal to $-1$, provided that $(N - \ell + c - m_\ell)/2 \ge 0$ and $(N - \ell) - (N - \ell + c - m_\ell)/2 \ge 0$. Notably, $(N - \ell + c - m_\ell)$ is always an even number for the cases considered in the lemma. The proof is then completed by the observation that $[b_\ell b_{\ell+1}, b_{\ell+1}b_{\ell+2}, \ldots, b_{N-1}b_N]$ and $[b_{\ell+1}, b_{\ell+2}, \ldots, b_N]$ are in one-to-one correspondence for given $b_\ell$. ∎

⁴ $|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})|$ may not have an explicit closed-form formula for memory order higher than one. However, our encoding algorithm can still be applied as long as $|\mathcal{A}(\boldsymbol{b}_{(\ell)}|\mathbb{G})|$ can be pre-calculated (cf. Appendix).
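The counting formula of Lemma 2 can be checked numerically against brute-force enumeration. The following Python sketch does so for $P = 2$; the function names are illustrative assumptions, and the argument `c` stands for the off-diagonal entry of the target $\mathbb{B}^T\mathbb{B}$ ($0$ for $N$ odd, $\mp 1$ for $N$ even).

```python
from itertools import product
from math import comb

def lemma2_count(prefix, c, N):
    """Closed-form |A(b_(l) | G)| for P = 2, where c is the off-diagonal entry of G."""
    l = len(prefix)
    m = sum(prefix[k] * prefix[k + 1] for k in range(l - 1))  # m_l (0 when l = 1)
    twice_k = (N - l) + c - m      # twice the number of remaining products equal to +1
    if twice_k % 2 or not 0 <= twice_k <= 2 * (N - l):
        return 0
    return comb(N - l, twice_k // 2)

def brute_force_count(prefix, c, N):
    """Enumerate every completion of the prefix and count those with lag-1 sum c."""
    l = len(prefix)
    total = 0
    for tail in product((-1, 1), repeat=N - l):
        b = tuple(prefix) + tail
        if sum(b[k] * b[k + 1] for k in range(N - 1)) == c:
            total += 1
    return total

# Example: N = 5, prefix b_1 = -1, target B^T B = 5 * I_2 (c = 0).
n_candidates = lemma2_count((-1,), 0, 5)
```

The closed form replaces an enumeration over $2^{N-\ell}$ completions by a single binomial coefficient, which is what makes on-the-fly generation of the code tree practical.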

It is already hinted in the above lemma that for $N$ odd, our encoding algorithm will uniformly pick $2^K$ codewords from the candidate sequences satisfying the exact SNR-maximization condition $\mathbb{B}^T\mathbb{B} = N\cdot\mathbb{I}_P$. However, for $N$ even, two conditions on candidate sequences will be used. Half of the codewords will be uniformly drawn from those candidate sequences satisfying $\mathbb{B}^T\mathbb{B} = \mathbb{G}_1$, and the other half of the codewords agree with $\mathbb{B}^T\mathbb{B} = \mathbb{G}_2$. The proposed codeword selection process is simply to list all the candidate sequences in binary-alphabetical order, starting from zero, and uniformly pick the codewords from the ordered list at every interval of $\Delta$, where

$$\Delta = \frac{|\mathcal{A}(b_1 = -1\,|\,\mathbb{G})| - 1}{2^K/\Theta - 1},$$

where $\mathbb{G}$ represents the desired $\mathbb{B}^T\mathbb{B}$, and $\Theta$ is the number of conditions and equals $1$ for $N$ odd, and $2$ for $N$ even. As a result, the selected codewords are those sequences with index $i \times \Delta$ for integer $i$. The encoding algorithm is summarized in the following.

Step 1. (Input) Let $i$ be the index of the requested codeword in the desired $(N, K)$ block code, where $0 \le i \le 2^K - 1$.

Step 2. (Initialization) Set $\Theta = 1$ for $N$ odd, and $2$ for $N$ even. Let $b_1 = -1$. Put (in terms of the notation in Lemma 2):

$$\mathbb{G} = \begin{cases} N\cdot\mathbb{I}_P, & \text{if } N \text{ odd}; \\ \mathbb{G}_1, & \text{if } N \text{ even and } 0 \le i < 2^{K-1}; \\ \mathbb{G}_2, & \text{if } N \text{ even and } 2^{K-1} \le i < 2^K. \end{cases}$$

Compute

$$\Delta = \frac{|\mathcal{A}(b_1|\mathbb{G})| - 1}{2^K/\Theta - 1}.$$

Also, re-adjust $i = i - 2^{K-1}$ if $N$ is even and $2^{K-1} \le i < 2^K$.

Let the minimum candidate sequence index $\rho_{\min}$ and the maximum candidate sequence index $\rho_{\max}$ in $\mathcal{A}(b_1|\mathbb{G})$ be respectively

$$\rho_{\min} = 0 \quad \text{and} \quad \rho_{\max} = |\mathcal{A}(b_1|\mathbb{G})| - 1.$$

Initialize $\ell = 1$ and $\rho = i \times \Delta$.

Step 3. (Generation of the next code bit) Set $\ell = \ell + 1$, and compute $\gamma = |\mathcal{A}(\boldsymbol{b}_{(\ell-1)}, -1\,|\,\mathbb{G})|$. If $\rho < \rho_{\min} + \gamma$, then the next code bit $b_\ell = -1$, and re-adjust $\rho_{\max} = \rho_{\min} + \gamma - 1$; else, the next code bit $b_\ell = 1$, and re-adjust $\rho_{\min} = \rho_{\min} + \gamma$.

Step 4. (Loop) If $\ell = N$, output codeword $\boldsymbol{b}$, and the algorithm stops; otherwise, go to Step 3.
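Steps 1 to 4 can be sketched in Python for $P = 2$ as follows. This is a hedged illustration rather than the authors' implementation: the names `count` and `encode` are assumptions, $\Delta$ is kept as an exact fraction so that $\rho = i \times \Delta$ need not be an integer, the case $2^K/\Theta = 1$ (where $\Delta$ is undefined) is handled by setting $\Delta = 0$ as an extra assumption, and $\rho_{\max}$ is omitted because it never enters the bit decisions.

```python
from fractions import Fraction
from math import comb

def count(prefix, c, N):
    """|A(prefix | G)| for P = 2 (Lemma 2); c is the off-diagonal entry of G."""
    l = len(prefix)
    m = sum(prefix[k] * prefix[k + 1] for k in range(l - 1))
    twice_k = (N - l) + c - m
    if twice_k % 2 or not 0 <= twice_k <= 2 * (N - l):
        return 0
    return comb(N - l, twice_k // 2)

def encode(i, N, K):
    """Return the i-th codeword (0 <= i < 2**K) of the rule-based (N, K) code."""
    theta_total = 1 if N % 2 else 2            # Step 2: number of conditions
    if N % 2:
        c = 0                                  # G = N * I_P
    elif i < 2 ** (K - 1):
        c = -1                                 # G = G_1
    else:
        c = 1                                  # G = G_2
        i -= 2 ** (K - 1)                      # re-adjust the index
    b = [-1]                                   # b_1 is always fixed to -1
    denom = Fraction(2 ** K, theta_total) - 1
    delta = Fraction(count(b, c, N) - 1) / denom if denom else Fraction(0)
    rho, rho_min = i * delta, 0
    while len(b) < N:                          # Steps 3-4
        gamma = count(b + [-1], c, N)          # candidates whose next bit is -1
        if rho < rho_min + gamma:
            b.append(-1)
        else:
            b.append(1)
            rho_min += gamma
    return b

code_5_2 = [encode(i, 5, 2) for i in range(4)]   # a (5, 2) code
code_4_2 = [encode(i, 4, 2) for i in range(4)]   # a (4, 2) code
```

Each code bit costs one evaluation of the counting formula, so a codeword is generated in $O(N)$ such evaluations without ever listing the candidate pool. The same counting routine can be reused by a decoder to decide which successor paths exist in the code tree.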

IV. Maximum-Likelihood Priority-First Search Decoding of Combined 
Channel Estimation and Error Protection Codes 

In this section, a recursive maximum-likelihood metric $g$ and its heuristic function $h$ are established for use with the priority-first search decoding algorithm to decode the structured codewords over multiple code trees.

A. Recursive maximum-likelihood metric g for priority-first search over multiple code trees 

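Let $\mathcal{C}_\theta$ be the set of the codewords that satisfy $\mathbb{B}^T\mathbb{B} = \mathbb{G}_\theta$, where $1 \le \theta \le \Theta$, and assume that $\mathcal{C} = \bigcup_{1\le\theta\le\Theta}\mathcal{C}_\theta$, and $\mathcal{C}_\theta \cap \mathcal{C}_\eta = \emptyset$ whenever $\theta \ne \eta$. Then, by denoting for convenience $\mathbb{D}_\theta = \mathbb{G}_\theta^{-1}$, we can continue the derivation of the maximum-likelihood criterion as:

\[
\begin{aligned}
\hat{b} &= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta} \left[-\mathrm{tr}\left(\left[(\mathbb{B}\otimes\mathbb{B})\,\mathrm{vec}(\mathbb{D}_\theta)\right]^T \mathrm{vec}(yy^H)\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\} \\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta} \left[-\mathrm{tr}\left(\mathrm{vec}(\mathbb{D}_\theta)^T(\mathbb{B}^T\otimes\mathbb{B}^T)\,\mathrm{vec}(yy^H)\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\} \\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta} \left[-\mathrm{tr}\left((\mathbb{B}\otimes\mathbb{B})^T\mathrm{vec}(yy^H)\,\mathrm{vec}(\mathbb{D}_\theta)^T\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\},
\end{aligned}
\tag{14}
\]

where "$\otimes$" is the Kronecker product, and $\mathbf{1}\{\cdot\}$ is the set indicator function that has been used in Lemma 2. Defining

\[
\mathbb{E} \triangleq
\begin{bmatrix}
0 & 0 & \cdots & 0 & 0\\
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}_{L\times L}
\quad\text{and}\quad
c \triangleq
\begin{bmatrix} b_1 & \cdots & b_N & 0 & \cdots & 0 \end{bmatrix}^T_{L\times 1},
\]

we get:

\[
\begin{aligned}
\mathbb{B}\otimes\mathbb{B}
&= \begin{bmatrix} c\otimes\mathbb{B} & (\mathbb{E}c)\otimes\mathbb{B} & \cdots & (\mathbb{E}^{P-1}c)\otimes\mathbb{B} \end{bmatrix}\\
&= \begin{bmatrix} c\otimes c & c\otimes(\mathbb{E}c) & \cdots & c\otimes(\mathbb{E}^{P-1}c) & (\mathbb{E}c)\otimes c & \cdots & (\mathbb{E}^{P-1}c)\otimes(\mathbb{E}^{P-1}c) \end{bmatrix}\\
&= \begin{bmatrix} \mathrm{vec}(cc^T) & \mathrm{vec}((\mathbb{E}c)c^T) & \cdots & \mathrm{vec}((\mathbb{E}^{P-1}c)c^T) & \mathrm{vec}(c(\mathbb{E}c)^T) & \cdots & \mathrm{vec}((\mathbb{E}^{P-1}c)(\mathbb{E}^{P-1}c)^T) \end{bmatrix},
\end{aligned}
\]

which indicates that the $i$th column of $\mathbb{B}\otimes\mathbb{B}$, where $i = 0, 1, \ldots, P^2-1$, can be written as $\mathrm{vec}\left((\mathbb{E}^{\,i \bmod P}c)(\mathbb{E}^{\lfloor i/P\rfloor}c)^T\right)$. Here, we adopt $\mathbb{E}^0c = c$ by convention.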
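The column identity for $\mathbb{B}\otimes\mathbb{B}$ can be checked numerically with a small Python sketch. The helper names (`shift`, `kron`, `vec_outer`) and the toy sizes $N = 3$, $P = 2$ are assumptions made only for this illustration; `vec` is taken to stack columns.

```python
def shift(c, k):
    """Apply E^k to the column c: move entries down by k positions, zero-filling."""
    return [0] * k + c[:len(c) - k]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vec_outer(u, v):
    """vec(u v^T) with column-major stacking."""
    return [ui * vj for vj in v for ui in u]


if __name__ == "__main__":
    N, P = 3, 2
    bits = [1, -1, 1]
    c = bits + [0] * (P - 1)                     # c = [b_1 ... b_N 0 ... 0]^T, length L
    cols = [shift(c, k) for k in range(P)]       # columns c, Ec, ..., E^{P-1}c of B
    B = [[cols[k][r] for k in range(P)] for r in range(len(c))]
    K = kron(B, B)
    for i in range(P * P):
        column_i = [row[i] for row in K]
        assert column_i == vec_outer(shift(c, i % P), shift(c, i // P))
    print("column identity holds for all", P * P, "columns")
```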
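Resume the derivation in (14) by denoting the $(i,j)$th matrix entry of $\mathbb{D}_\theta$ by $\delta^{(\theta)}_{i,j}$:

\[
\begin{aligned}
\hat{b} &= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,\mathrm{vec}\left((\mathbb{E}^ic)(\mathbb{E}^jc)^T\right)^T\mathrm{vec}(yy^H)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,\mathrm{tr}\left((\mathbb{E}^jc)(\mathbb{E}^ic)^Tyy^H\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,\mathrm{tr}\left((\mathbb{E}^i)^Tyy^H\mathbb{E}^jcc^T\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\mathrm{tr}\left(\mathbb{W}_\theta cc^T\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\},
\end{aligned}
\]

where

\[
\mathbb{W}_\theta \triangleq \sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,(\mathbb{E}^i)^Tyy^H\mathbb{E}^j.
\]

We then conclude (a positive scaling of the objective does not change the minimizer):

\[
\begin{aligned}
\hat{b} &= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\mathrm{vec}(\mathbb{W}_\theta)^H\mathrm{vec}(cc^T)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\\
&= \arg\min_{b\in\mathcal{C}} \sum_{\theta=1}^{\Theta}\left[-\mathrm{vec}(\mathbb{W}_\theta)^H\mathrm{vec}(cc^T)-\mathrm{vec}(cc^T)^T\mathrm{vec}(\mathbb{W}_\theta)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\\
&= \arg\min_{b\in\mathcal{C}} \frac{1}{2}\sum_{\theta=1}^{\Theta}\left[\sum_{m=1}^{N}\sum_{n=1}^{N}\left(-w^{(\theta)}_{m,n}b_mb_n\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\},
\end{aligned}
\tag{15}
\]

where $w^{(\theta)}_{m,n}$ is the real part of the $(m,n)$th entry of $\mathbb{W}_\theta$, and is given by:

\[
w^{(\theta)}_{m,n} = \sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,\mathrm{Re}\{y_{m+i}\,y^*_{n+j}\}.
\]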

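The maximum-likelihood decision remains unchanged by adding a constant that is independent of the codeword $b$; hence, a constant is added to make the decision criterion non-negative as (cf. footnote 5):

\[
\begin{aligned}
\hat{b} &= \arg\min_{b\in\mathcal{C}}\left\{\sum_{m=1}^{N}\max_{1\le\eta\le\Theta}\left(\sum_{n=1}^{m-1}\left|w^{(\eta)}_{m,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{m,m}\right|\right)-\frac{1}{2}\sum_{\theta=1}^{\Theta}\left[\sum_{m=1}^{N}\sum_{n=1}^{N}w^{(\theta)}_{m,n}b_mb_n\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}\right\}\\
&= \arg\min_{b\in\mathcal{C}}\sum_{\theta=1}^{\Theta}\left[\sum_{m=1}^{N}\max_{1\le\eta\le\Theta}\left(\sum_{n=1}^{m-1}\left|w^{(\eta)}_{m,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{m,m}\right|\right)-\frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N}w^{(\theta)}_{m,n}b_mb_n\right]\mathbf{1}\{b\in\mathcal{C}_\theta\}.
\end{aligned}
\]

It remains to prove that the metric of

\[
\sum_{m=1}^{N}\max_{1\le\eta\le\Theta}\left(\sum_{n=1}^{m-1}\left|w^{(\eta)}_{m,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{m,m}\right|\right)-\frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N}w^{(\theta)}_{m,n}b_mb_n
\]

can be computed recursively for $b \in \mathcal{C}_\theta$.
Define for path $b_{(\ell)}$ over code tree $\theta$ that

\[
g(b_{(\ell)}) \triangleq \sum_{m=1}^{\ell}\max_{1\le\eta\le\Theta}\left(\sum_{n=1}^{m-1}\left|w^{(\eta)}_{m,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{m,m}\right|\right)-\frac{1}{2}\sum_{m=1}^{\ell}\sum_{n=1}^{\ell}w^{(\theta)}_{m,n}b_mb_n.
\tag{16}
\]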

5 Here, a non-negative maximum-likelihood criterion makes possible the later definition of a path metric $g(b_{(\ell)})$ that is non-decreasing along any path in the code tree. A non-decreasing path metric has been shown to be a sufficient condition for priority-first search to guarantee to locate the codeword with the smallest path metric [8], [9]. It can then be anticipated (cf. Section IV-B) that letting the heuristic function be zero for all paths in the code tree suffices to result in an evaluation function satisfying the optimality condition of the priority-first search.

Notably, the additive constant that makes the evaluation function non-decreasing along any path in the code tree can also be obtained by first defining g based on (15), and then determining its respective h according to the optimality condition. Such an approach, however, complicates the determination of the heuristic function h when the evaluation function is additionally required to be recursively computable. The alternative approach that directly defines a recursively computable g based on a non-negative maximum-likelihood criterion is accordingly adopted in this work.



February 2, 2008 



DRAFT 



19 



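Then, by the symmetry that $w^{(\theta)}_{m,n} = w^{(\theta)}_{n,m}$ for $1 \le m, n \le N$ and $1 \le \theta \le \Theta$, we have that for $1 \le \ell \le N-1$,

\[
\begin{aligned}
g(b_{(\ell+1)}) &= g(b_{(\ell)}) + \max_{1\le\eta\le\Theta}\left(\sum_{n=1}^{\ell}\left|w^{(\eta)}_{\ell+1,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{\ell+1,\ell+1}\right|\right)-\sum_{n=1}^{\ell}w^{(\theta)}_{\ell+1,n}b_{\ell+1}b_n-\frac{1}{2}w^{(\theta)}_{\ell+1,\ell+1}\\
&= g(b_{(\ell)}) + \max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{\ell+1} - b_{\ell+1}\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j}\,\mathrm{Re}\left\{y_{\ell+i+1}\cdot u_j(b_{(\ell+1)})\right\},
\end{aligned}
\tag{17}
\]

where

\[
\alpha^{(\eta)}_{\ell+1} \triangleq \sum_{n=1}^{\ell}\left|w^{(\eta)}_{\ell+1,n}\right|+\frac{1}{2}\left|w^{(\eta)}_{\ell+1,\ell+1}\right|
= \sum_{n=1}^{\ell}\left|\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\eta)}_{i,j}\,\mathrm{Re}\{y_{\ell+i+1}y^*_{n+j}\}\right|+\frac{1}{2}\left|\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\eta)}_{i,j}\,\mathrm{Re}\{y_{\ell+i+1}y^*_{\ell+j+1}\}\right|,
\tag{18}
\]

and for $0 \le j \le P-1$,

\[
u_j(b_{(\ell+1)}) \triangleq \sum_{n=1}^{\ell}b_ny^*_{n+j}+\frac{1}{2}b_{\ell+1}y^*_{\ell+j+1}
= u_j(b_{(\ell)})+\frac{1}{2}\left(b_\ell y^*_{\ell+j}+b_{\ell+1}y^*_{\ell+j+1}\right).
\]

This implies that we can recursively compute $g(b_{(\ell+1)})$ and $\{u_j(b_{(\ell+1)})\}_{0\le j\le P-1}$ from the previous $g(b_{(\ell)})$ and $\{u_j(b_{(\ell)})\}_{j=0}^{P-1}$ with the knowledge of $y_{\ell+1}, y_{\ell+2}, \ldots, y_{\ell+P}$ and the code bits $b_\ell$ and $b_{\ell+1}$, and the initial condition satisfies that $g(b_{(0)}) = u_j(b_{(0)}) = b_0 = 0$ for $0 \le j \le P-1$.

A final remark in this discussion is that although the computation burden of $\alpha^{(\eta)}_{\ell+1}$ in (18) increases linearly with $\ell$, such a linearly growing load can be moderately compensated by the fact that $\alpha^{(\eta)}_{\ell+1}$ needs to be computed only once for each $\ell$ and $\eta$, because it can be shared by all paths ending at level $\ell$ over code tree $\eta$.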
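As a sanity check of the recursion, the following Python sketch compares the recursive update of $g$ (driven by the auxiliary sums $u_j$) against a direct evaluation of the defining double sum. It is only illustrative: the function names, the toy sizes, and the way the matrices $\mathbb{D}_\theta$ are supplied (as symmetric real $P \times P$ nested lists) are assumptions, and the received samples are stored 0-based, so `y[k - 1]` holds $y_k$.

```python
import random


def w_entry(y, delta, m, n):
    """w_{m,n} = sum_{i,j} delta[i][j] * Re{ y_{m+i} * conj(y_{n+j}) } (1-based m, n)."""
    P = len(delta)
    return sum(delta[i][j] * (y[m + i - 1] * y[n + j - 1].conjugate()).real
               for i in range(P) for j in range(P))


def alpha(y, delta, m):
    """alpha_m = sum_{n<m} |w_{m,n}| + |w_{m,m}| / 2 for one code tree."""
    return (sum(abs(w_entry(y, delta, m, n)) for n in range(1, m))
            + 0.5 * abs(w_entry(y, delta, m, m)))


def g_direct(y, deltas, theta, bits):
    """Evaluate the path metric g from its defining sums."""
    ell = len(bits)
    bound = sum(max(alpha(y, d, m) for d in deltas) for m in range(1, ell + 1))
    quad = sum(w_entry(y, deltas[theta], m, n) * bits[m - 1] * bits[n - 1]
               for m in range(1, ell + 1) for n in range(1, ell + 1))
    return bound - 0.5 * quad


def g_recursive(y, deltas, theta, bits):
    """Evaluate g by the level-by-level recursion with auxiliary sums u_j."""
    delta = deltas[theta]
    P = len(delta)
    g, u, prev = 0.0, [0j] * P, 0            # g(b_(0)) = u_j(b_(0)) = b_0 = 0
    for ell, bit in enumerate(bits):          # extend b_(ell) to b_(ell+1)
        for j in range(P):
            if prev:                          # b_0 = 0 contributes nothing
                u[j] += 0.5 * prev * y[ell + j - 1].conjugate()
            u[j] += 0.5 * bit * y[ell + j].conjugate()
        corr = sum(delta[i][j] * (y[ell + i] * u[j]).real
                   for i in range(P) for j in range(P))
        g += max(alpha(y, d, ell + 1) for d in deltas) - bit * corr
        prev = bit
    return g


if __name__ == "__main__":
    random.seed(0)
    N, P = 6, 2
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N + P - 1)]
    det = N * N - 1
    deltas = [[[N / det, -c / det], [-c / det, N / det]] for c in (1, -1)]
    bits = [1, -1, -1, 1, 1, -1]
    print(g_direct(y, deltas, 0, bits), g_recursive(y, deltas, 0, bits))
```

For clarity the sketch recomputes $\alpha$ at every call; in a decoder it would be cached per level and per code tree, as noted in the remark above.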

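B. Heuristic function h that validates the optimality condition

Taking the maximum-likelihood metric g into the sufficient condition for optimality yields that:

\[
\sum_{m=1}^{\ell}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_m-\frac{1}{2}\sum_{m=1}^{\ell}\sum_{n=1}^{\ell}w^{(\theta)}_{m,n}b_mb_n+h(b_{(\ell)})
\le \min_{\{\tilde{b}\in\mathcal{C}_\theta:\ \tilde{b}_{(\ell)}=b_{(\ell)}\}}\left[\sum_{m=1}^{N}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_m-\frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N}w^{(\theta)}_{m,n}\tilde{b}_m\tilde{b}_n+h(\tilde{b})\right].
\]

Hence, in addition to $h(b) = 0$, the heuristic function should satisfy:

\[
h(b_{(\ell)}) \le \sum_{m=\ell+1}^{N}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_m-\max_{\{\tilde{b}\in\mathcal{C}_\theta:\ \tilde{b}_{(\ell)}=b_{(\ell)}\}}\left(\sum_{m=\ell+1}^{N}\sum_{n=1}^{\ell}w^{(\theta)}_{m,n}\tilde{b}_m\tilde{b}_n+\frac{1}{2}\sum_{m=\ell+1}^{N}\sum_{n=\ell+1}^{N}w^{(\theta)}_{m,n}\tilde{b}_m\tilde{b}_n\right).
\tag{19}
\]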
Apparently, a function that is guaranteed to satisfy (19) is the zero-heuristic function, that is, $h_1(b_{(\ell)}) = 0$ for any path $b_{(\ell)}$ in the code trees. Adopting the zero-heuristic function $h_1$, together with the recursively computable maximum-likelihood metric g in (16), makes feasible the on-the-fly priority-first search decoding. In comparison with the exhaustive-checking decoding, a significant improvement in the computational complexity results, especially at medium-to-high SNRs.

In situations where the codeword length N is not large, such as $N < 50$, so that the demand of on-the-fly decoding can be moderately relaxed, we can adopt a larger heuristic function to further reduce the computational complexity. Upon the reception of all $y_1, \ldots, y_L$, the heuristic function that satisfies (19) regardless of $b_{\ell+1}, \ldots, b_N$ can be increased up to:

\[
\begin{aligned}
h_2(b_{(\ell)}) &= \sum_{m=\ell+1}^{N}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_m-\sum_{m=\ell+1}^{N}\left|\sum_{n=1}^{\ell}w^{(\theta)}_{m,n}b_n\right|-\frac{1}{2}\sum_{m=\ell+1}^{N}\sum_{n=\ell+1}^{N}\left|w^{(\theta)}_{m,n}\right|\\
&= \sum_{m=\ell+1}^{N}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_m-\sum_{m=\ell+1}^{N}\left|v^{(\theta)}_m(b_{(\ell)})\right|-\beta^{(\theta)}_\ell,
\end{aligned}
\tag{20}
\]

where for $1 \le \ell, m \le N$ and $1 \le \theta \le \Theta$,

\[
v^{(\theta)}_m(b_{(\ell)}) \triangleq \sum_{n=1}^{\ell}w^{(\theta)}_{m,n}b_n = v^{(\theta)}_m(b_{(\ell-1)})+b_\ell w^{(\theta)}_{m,\ell}
\]

and

\[
\beta^{(\theta)}_\ell \triangleq \sum_{m=\ell+1}^{N}\left(\sum_{n=\ell+1}^{m-1}\left|w^{(\theta)}_{m,n}\right|+\frac{1}{2}\left|w^{(\theta)}_{m,m}\right|\right)
= \beta^{(\theta)}_{\ell-1}-\sum_{n=\ell+1}^{N}\left|w^{(\theta)}_{\ell,n}\right|-\frac{1}{2}\left|w^{(\theta)}_{\ell,\ell}\right|
\]

with initially $v^{(\theta)}_m(b_{(0)}) = b_0 = 0$, and $\beta^{(\theta)}_0 = \sum_{m=1}^{N}\alpha^{(\theta)}_m$. Simulations show that, compared with the zero-heuristic function $h_1$, the heuristic function in (20) further reduces the number of path expansions during the decoding process by up to one order of magnitude (cf. Table I, in which $f_1 = g + h_1 = g$ and $f_2 = g + h_2$).

A final note on the priority-first search of the maximum-likelihood codeword is that in those cases where the target equality on $\mathbb{B}^T\mathbb{B}$ cannot be fulfilled exactly, codewords will be selected equally from multiple code trees, e.g., one code tree structured according to $\mathbb{B}^T\mathbb{B} = \mathbb{G}_1$, and the other code tree targeting $\mathbb{B}^T\mathbb{B} = \mathbb{G}_2$ for N even and P = 2. Since the transmitted codeword belongs to only one of the code trees, maintaining an individual Stack for the codeword search over each code tree would introduce a considerable unnecessary decoding burden, especially for the code trees that the transmitted codeword does not belong to. Hence, only one Stack is maintained during the priority-first search, and the evaluation function values for different code trees are compared and sorted in the same Stack. The path to be expanded next is therefore the one whose evaluation function value is globally the smallest.
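The single-Stack search over several code trees can be sketched in Python with a binary heap standing in for the Stack. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the zero heuristic ($f = g$), it computes each branch metric directly from the $w^{(\theta)}_{m,n}$ values rather than through the $u_j$ recursion, it represents each code tree by an explicit list of codewords, and all function names are hypothetical.

```python
import heapq
import random
from itertools import product


def w_matrix(y, delta, N):
    """1-based table w[m][n] = sum_{i,j} delta[i][j] * Re{ y_{m+i} * conj(y_{n+j}) }."""
    P = len(delta)
    w = [[0.0] * (N + 1) for _ in range(N + 1)]
    for m in range(1, N + 1):
        for n in range(1, N + 1):
            w[m][n] = sum(delta[i][j] * (y[m + i - 1] * y[n + j - 1].conjugate()).real
                          for i in range(P) for j in range(P))
    return w


def pfs_decode(y, trees, deltas):
    """Priority-first search with f = g over several code trees sharing one heap.
    trees[t] is a list of codewords (tuples of +/-1) belonging to code tree t."""
    N = len(trees[0][0])
    W = [w_matrix(y, d, N) for d in deltas]
    # amax[m] = max over trees of alpha_m = sum_{n<m} |w_{m,n}| + |w_{m,m}| / 2
    amax = [0.0] + [max(sum(abs(w[m][n]) for n in range(1, m)) + 0.5 * abs(w[m][m])
                        for w in W) for m in range(1, N + 1)]
    prefixes = [{cw[:k] for cw in tree for k in range(1, N + 1)} for tree in trees]
    stack = [(0.0, t, ()) for t in range(len(trees))]       # one root per code tree
    heapq.heapify(stack)
    expansions = 0
    while stack:
        f, t, path = heapq.heappop(stack)                   # globally smallest f
        if len(path) == N:
            return path, f, expansions                      # first full path is optimal
        expansions += 1
        m = len(path) + 1
        for bit in (-1, 1):
            child = path + (bit,)
            if child in prefixes[t]:                        # stay inside code tree t
                step = (amax[m]
                        - bit * sum(W[t][m][n] * path[n - 1] for n in range(1, m))
                        - 0.5 * W[t][m][m])                 # non-negative branch metric
                heapq.heappush(stack, (f + step, t, child))
    return None, float("inf"), expansions


def toy_tree(N, c):
    """All sequences with b_1 = 1 and b_1*b_2 + ... + b_{N-1}*b_N = c (toy code tree)."""
    return [s for s in product((-1, 1), repeat=N)
            if s[0] == 1 and sum(s[i] * s[i + 1] for i in range(N - 1)) == c]


if __name__ == "__main__":
    random.seed(1)
    N, P = 6, 2
    det = N * N - 1
    deltas = [[[N / det, -c / det], [-c / det, N / det]] for c in (1, -1)]
    trees = [toy_tree(N, 1), toy_tree(N, -1)]
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N + P - 1)]
    print(pfs_decode(y, trees, deltas))
```

Because every branch metric is non-negative, the first full-length path popped from the heap carries the smallest metric among all codewords in all trees, which is why a single shared heap suffices.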

V. Simulation Results 

In this section, the performance of the rule-based constructed codes proposed in Section III is examined. Also illustrated is the decoding complexity of the maximum-likelihood priority-first search decoding algorithm presented in the previous section. For ease of comparison, the channel parameters used in our simulations follow those in [18], where $h$ is complex zero-mean Gaussian distributed with $E[hh^H] = (1/P)\mathbb{I}_P$ and $P = 2$. The average system SNR is thus given by:

\[
\text{Average SNR} = \frac{\mathrm{tr}\left(E[hh^H]\,\mathbb{B}^T\mathbb{B}\right)}{L\sigma_n^2} = \frac{1}{P}\cdot\frac{\mathrm{tr}\left(\mathbb{B}^T\mathbb{B}\right)}{L\sigma_n^2} = \frac{N}{(N+P-1)\,\sigma_n^2},
\tag{21}
\]

since $\mathrm{tr}(\mathbb{B}^T\mathbb{B}) = NP$ for all codewords simulated (cf. footnote 6).

There are three codes simulated in Fig. 3: the computer-searched half-rate code obtained in [18] (SA-22), the rule-based double-tree code in which half of the codewords satisfy $\mathbb{B}^T\mathbb{B} = \mathbb{G}_1$ and the remaining half satisfy $\mathbb{B}^T\mathbb{B} = \mathbb{G}_2$ (Double-22), and the rule-based single-tree code whose codewords are all selected from the candidate sequences satisfying $\mathbb{B}^T\mathbb{B} = \mathbb{G}_1$ (Single-22). We observe from Fig. 3 that the Double-22 code performs almost the same as the SA-22 code obtained in [18] at N = 22. In fact, extensive simulations in Fig. 4 show that the rule-based double-tree half-rate codes perform as well as the computer-searched half-rate codes for all N > 12. However, when N < 12, the approximation in (12) can no longer be well maintained due to the restriction that $|\mathcal{A}(b_{(\ell)}|\mathbb{G})|$ must be an integer, and an apparent performance deviation between the rule-based double-tree half-rate codes and the computer-searched half-rate codes can therefore be observed for N below 12.

6 The authors in [18] directly define the channel SNR as $1/\sigma_n^2$. It is apparent that their definition is exactly the limit of (21) as N approaches infinity.

Since it is assumed that an adequate guard period exists between two encoding blocks (so that there is no interference between two consecutive decoding blocks), the computation of the system SNR for finite N should be adjusted to account for this muting (but still part-of-the-decoding-block) guard period. For example, in comparison of the (6,3) and (20,10) codes over channels with memory order 1 (i.e., P = 2), one can easily observe that the former can only transmit 18 code bits in the time interval of 21 code bits, while the latter pushes out up to 20 code bits in a period of the same duration. Thus, under fixed code bit transmission power and fixed component noise power $\sigma_n^2$, it is reasonable for the (20,10) code to result in a higher SNR than the (6,3) code.

[Figure 3 plot: WER versus SNR (dB) for Single-22, Double-22 and SA-22.]

Fig. 3. The maximum-likelihood word error rates (WERs) of the half-rate computer-searched code by simulated annealing in [18] (SA-22), the rule-based half-rate code with double code trees (Double-22), and the rule-based half-rate code with single code tree (Single-22). The codeword length is N = 22.

In addition to the Double-22 code, the performance of the Single-22 code is also simulated in Fig. 3. Since the pairwise codeword distance in the sense of (11) for the Single-22 code is in general smaller than that of the Double-22 code, its performance is about 0.2 dB worse than that of the Double-22 code. However, we will see in later simulations that the Single-22 code has the smallest decoding complexity among the three codes in Fig. 3. This suggests that selecting codewords uniformly from a single code tree should not be ruled out as a candidate design, especially when the decoding complexity becomes the main system concern.

[Figure 4 plots: WER versus SNR (dB), two panels.]

Fig. 4. The maximum-likelihood word error rates (WERs) of the computer-searched half-rate code by simulated annealing (SA-N) and the rule-based half-rate code with double code trees (Double-N).

In Fig. 5, the average numbers of node expansions per information bit are illustrated for the codes examined in Fig. 3. Since the number of nodes expanded is exactly the number of tree branch metrics (i.e., one recursion of f-function values) computed, the equivalent complexity of the exhaustive decoder is correspondingly plotted. It can then be observed that, in comparison with the exhaustive decoder, a significant reduction in computational burden can be obtained at moderate-to-high SNRs by adopting the Double-22 code and the priority-first search decoder with on-the-fly evaluation function $f_1$, namely, g (cf. Eq. (17)). Further reduction can be achieved if the Double-22 code is replaced with the Single-22 code. This is because performing the sequential search over multiple code trees introduces extra node expansions for those code trees that the transmitted codeword does not belong to. An additional order-of-magnitude reduction in node expansions can be achieved when the evaluation function $f_2 = g + h_2$ is used instead.

The authors in [3] and [18] only focused on the word error rates (WERs). No bit error rate (BER) performances that involve the mapping design between the information bit patterns and the codewords were presented. Yet, in certain applications, such as voice transmission and digital radio broadcasting, the BER is generally considered a more critical performance index. In addition, the adoption of the BER performance index, as well as the signal-to-noise ratio per information bit, facilitates the comparison among codes of different code rates.

[Figure 5 plot: average number of node expansions per information bit versus SNR (dB).]

Fig. 5. The average numbers of node expansions per information bit for the computer-searched code in [18] by exhaustive decoding (EXH-SA-22), and the rule-based single-tree (SEQ-Single-22) and double-tree (SEQ-Double-22) codes using the priority-first search decoding guided by either evaluation function $f_1$ or evaluation function $f_2$.

Figure 6 depicts the BER performances of the codes simulated in Fig. 3. The corresponding $E_b/N_0$ is computed according to:

\[
E_b/N_0 = \frac{1}{R}\cdot\text{SNR},
\]

where R = K/N is the code rate. The mapping between the bit patterns and the codewords of the given computer-searched code is obtained through simulated annealing by minimizing the upper bound of:

\[
\mathrm{BER} \le \frac{1}{2^K}\sum_{i=1}^{2^K}\sum_{j=1,\,j\ne i}^{2^K}\frac{D(m(i),m(j))}{K}\,\Pr\left(\hat{b} = b(j)\,\middle|\,b(i)\text{ transmitted}\right),
\]

where, in addition to the notation defined earlier, m(i) is the information sequence corresponding to the i-th codeword, and $D(\cdot,\cdot)$ is the Hamming distance. For the rule-based constructed codes in Section III-C, the binary representation of the index of the requested codeword in Step 1 is directly taken as the information bit pattern corresponding to the requested codeword. The result in Fig. 6 then indicates that the BER performances of the three curves are almost the same. This leads to the conclusion that taking the binary representation of the requested codeword index as the information bit pattern for the rule-based constructed code not only eases its implementation but also yields BER performance similar to that of the computer-optimized codes.

[Figure 6 plot: BER versus $E_b/N_0$ (dB).]

Fig. 6. Bit error rates (BERs) for the codes simulated in Fig. 3.
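The conversion between the system SNR and $E_b/N_0$ used for the BER curves amounts to a fixed offset in decibels. A minimal Python sketch follows; the function name `ebn0_db` is hypothetical and chosen only for this illustration.

```python
import math


def ebn0_db(snr_db, K, N):
    """E_b/N_0 in dB obtained from the system SNR in dB for a code of rate R = K/N,
    using E_b/N_0 = SNR / R."""
    rate = K / N
    return snr_db - 10.0 * math.log10(rate)


if __name__ == "__main__":
    # For a half-rate code the offset is 10*log10(2), about 3.01 dB.
    print(round(ebn0_db(10.0, 11, 22), 2))
```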

Finally, we demonstrate the WER and BER performances of the Single-26, Double-26, Single-30 and Double-30 codes, together with those of the Single-22 and Double-22 codes, over the quasi-static fading channels in Figs. 7 and 8, respectively. Both figures show that the Double-30 code has the best maximum-likelihood performance not only in WER but also in BER. This result echoes the usual anticipation that the performance favors a longer code as long as the channel coefficients remain unchanged in a coding block. Their decoding complexities are listed in Table I, from which we observe that the saving in decoding complexity of metric $f_2$ with respect to metric $f_1$ increases as the codeword length further grows.

[Figure 7 plot: WER versus SNR (dB).]

Fig. 7. Word error rates (WERs) for the codes of Single-22, Double-22, Single-26, Double-26, Single-30 and Double-30.

VI. Codes for channels with fast fading 

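In previous sections, as well as in [2], [6] and [18], it is assumed that the channel coefficients $h$ are invariant in each coding block of length L = N + P - 1. In this section, we show that the approaches employed in previous sections are also applicable to the situation where $h$ may change every Q symbols, where Q < L.

For $1 \le k \le M = \lceil L/Q\rceil$, let $h_k = [h_{1,k}\ h_{2,k}\ \cdots\ h_{P,k}]^T$ be the constant channel coefficients at the kth sub-block. Denote by $b_k = [b_{(k-1)Q-P+2}\ \cdots\ b_{(k-1)Q+1}\ \cdots\ b_{kQ}]^T$ the portion of $b$ that affects the output portion $y_k = [y_{(k-1)Q+1}\ y_{(k-1)Q+2}\ \cdots\ y_{kQ}]^T$, where we assume $b_j = 0$ for $j \le 0$ and $j > N$ for notational convenience. Then, for a channel whose coefficients change every Q symbols, the system model defined previously remains as $y = \mathbb{B}h + n$ except that both $y$ and $n$ extend as $MQ \times 1$ vectors with $y_j = n_j = 0$ for $j > L$, and $\mathbb{B}$ and $h$ have to be re-defined as

\[
\mathbb{B} \triangleq \mathbb{B}_1 \oplus \mathbb{B}_2 \oplus \cdots \oplus \mathbb{B}_M
\quad\text{and}\quad
h \triangleq \begin{bmatrix} h_1^H & h_2^H & \cdots & h_M^H \end{bmatrix}^H,
\]

where $\mathbb{B}_k \triangleq [\,\mathbb{0}_{Q\times(P-1)}\ \ \mathbb{I}_Q\,]\,[\,b_k\ \ \mathbb{E}b_k\ \ \cdots\ \ \mathbb{E}^{P-1}b_k\,]$ is a $Q \times P$ matrix,

\[
\mathbb{E} \triangleq
\begin{bmatrix}
0 & 0 & \cdots & 0 & 0\\
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}_{(Q+P-1)\times(Q+P-1)}
\]

and "$\oplus$" is the direct sum operator of two matrices (cf. footnote 7).

7 For two matrices $\mathbb{A}$ and $\mathbb{B}$, the direct sum of $\mathbb{A}$ and $\mathbb{B}$ is defined as $\mathbb{A} \oplus \mathbb{B} = \begin{bmatrix} \mathbb{A} & \mathbb{0}\\ \mathbb{0} & \mathbb{B} \end{bmatrix}$.

Fig. 8. Bit error rates (BERs) for the codes of Single-22, Double-22, Single-26, Double-26, Single-30 and Double-30.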
TABLE I

Average numbers of node expansions per information bit for the priority-first search decoding of the constructed half-rate codes of length 22, 26 and 30.

SNR                  5dB    6dB    7dB    8dB    9dB    10dB   11dB   12dB   13dB   14dB   15dB
Double-22-f_1        671    590    506    436    375    320    274    236    204    178    156
Double-22-f_2        68     55     42     32     26     20     17     14     12     10     9
ratio of f_1/f_2     9.8    10.7   12.0   13.6   14.4   16.0   16.1   16.8   17.0   17.8   17.3
Double-26-f_1        2361   2006   1695   1416   1189   981    813    677    523    499    392
Double-26-f_2        175    130    94     69     53     39     29     23     18     15     13
ratio of f_1/f_2     13.5   15.4   18.0   20.5   22.4   25.2   28.0   29.4   29.1   33.3   30.2
Double-30-f_1        8455   7073   5760   5133   3759   3430   2644   1996   1765   1368   1081
Double-30-f_2        459    332    232    166    119    86     60     44     33     25     20
ratio of f_1/f_2     18.4   21.3   24.8   30.9   31.6   39.9   44.1   45.4   53.4   54.7   54.1
Single-22-f_1        460    371    308    250    200    163    130    105    85     69     57
Single-22-f_2        45     33     26     20     15     12     10     8      7      6      5
ratio of f_1/f_2     10.2   11.2   11.8   12.5   13.3   13.5   13.0   13.1   12.1   11.5   11.4
Single-26-f_1        1635   1328   1061   839    666    522    403    312    244    191    152
Single-26-f_2        112    79     57     42     31     23     17     13     11     9      7
ratio of f_1/f_2     14.6   16.8   18.6   20.0   21.5   22.7   23.7   23.9   22.2   21.2   21.7
Single-30-f_1        5871   4695   3857   2924   2335   1813   1328   884    805    572    416
Single-30-f_2        284    199    144    101    72     51     35     26     18     14     11
ratio of f_1/f_2     20.6   23.6   26.8   29.0   32.4   35.5   38.0   34.0   44.7   40.9   37.8


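Based on the new system model, we have $\mathbb{P}_B = \mathbb{P}_{B_1} \oplus \mathbb{P}_{B_2} \oplus \cdots \oplus \mathbb{P}_{B_M}$, where $\mathbb{P}_{B_k} = \mathbb{B}_k(\mathbb{B}_k^T\mathbb{B}_k)^{-1}\mathbb{B}_k^T$, and the maximum-likelihood criterion becomes:

\[
\hat{b} = \arg\min_{b\in\mathcal{C}}\sum_{k=1}^{M}\left\|\mathrm{vec}(y_ky_k^H)-\mathrm{vec}(\mathbb{P}_{B_k})\right\|^2.
\tag{22}
\]

Again, codeword $b$ and transformed codeword $\mathbb{P}_B$ are not in one-to-one correspondence unless the first element of $b$, namely $b_1$, is fixed (cf. footnote 8).

Since $\mathbb{B}^T\mathbb{B} = (\mathbb{B}_1^T\mathbb{B}_1) \oplus (\mathbb{B}_2^T\mathbb{B}_2) \oplus \cdots \oplus (\mathbb{B}_M^T\mathbb{B}_M)$, the maximization of system SNR can be achieved simply by assigning

\[
\mathbb{B}_1^T\mathbb{B}_1 = \mathbb{B}_2^T\mathbb{B}_2 = \cdots = \mathbb{B}_M^T\mathbb{B}_M = Q\cdot\mathbb{I}_P
\tag{23}
\]

if such an assignment is possible. For the same reason mentioned in Section III-A, an approximation to (23) will have to be taken in the true code design.

It remains to determine the number of all possible ±1-sequences of length N, whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$, subject to $\mathbb{B}_k^T\mathbb{B}_k = \mathbb{G}_k$ for $1 \le k \le M$.

8 It can be derived that given Q > P and $\mathbb{B}_k^T\mathbb{B}_k = \mathbb{G}_k$ for $1 \le k \le M$,

\[
b_{Q-P+2} = b_1 \times (-1)^{(Q-P+1-\gamma_{P,P-1,1})/2}
\]
\[
b_{kQ-P+2} = b_{(k-1)Q-P+2} \times (-1)^{(Q-\gamma_{P,P-1,k})/2}\quad\text{for } k = 2, \ldots, M-1,
\]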



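Lemma 3: Fix P = 2 and Q > P, and put

\[
\mathbb{B}_1^T\mathbb{B}_1 = \begin{bmatrix} Q & c_1\\ c_1 & Q-1 \end{bmatrix},\quad
\mathbb{B}_k^T\mathbb{B}_k = \begin{bmatrix} Q & c_k\\ c_k & Q \end{bmatrix}\ \text{for } 2 \le k \le M-1,\quad\text{and}\quad
\mathbb{B}_M^T\mathbb{B}_M = \begin{bmatrix} N-(M-1)Q & c_M\\ c_M & N-(M-1)Q+1 \end{bmatrix},
\tag{24}
\]

where in our code selection process, $[c_1, c_2, \ldots, c_M] \in \{0, \pm 1\}^M$ will be chosen such that $Q-1+c_1$, $Q+c_k$ for $2 \le k \le M-1$, and $N-(M-1)Q+c_M$ are all even. Then, the number of all possible ±1-sequences of length N, whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$ subject to (24) is given by:

\[
\begin{cases}
\displaystyle\binom{Q-(\ell \bmod Q)}{\frac{Q-(\ell \bmod Q)+c_\tau-m_\ell}{2}}\mathbf{1}\{|c_\tau-m_\ell| \le Q-(\ell \bmod Q)\}\left[\prod_{k=\tau+1}^{M-1}\binom{Q}{\frac{Q+c_k}{2}}\right]\binom{N-(M-1)Q}{\frac{N-(M-1)Q+c_M}{2}}, & \text{for } 1 \le \tau < M;\\[3ex]
\displaystyle\binom{N-(M-1)Q-(\ell \bmod Q)}{\frac{N-(M-1)Q-(\ell \bmod Q)+c_M-m_\ell}{2}}\mathbf{1}\{|c_M-m_\ell| \le N-(M-1)Q-(\ell \bmod Q)\}, & \text{for } \tau = M,
\end{cases}
\]

where $\tau = \lfloor \ell/Q\rfloor + 1$, and

\[
m_\ell =
\begin{cases}
0, & \ell = 1 \text{ or } (\ell = (\tau-1)Q \text{ and } 2 \le \tau \le M);\\
b_1b_2 + \cdots + b_{\ell-1}b_\ell, & 1 < \ell < Q;\\
b_{(\tau-1)Q}b_{(\tau-1)Q+1} + \cdots + b_{\ell-1}b_\ell, & (\tau-1)Q < \ell < \tau Q \text{ and } 2 \le \tau \le M.
\end{cases}
\]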
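The counting rule of Lemma 3 can be illustrated with a short Python sketch that multiplies one binomial factor per sub-block. This is a hedged illustration, not the lemma's exact expression: it assumes P = 2, it treats the adjacent products $b_ib_{i+1}$ as partitioned into sub-blocks with first indices $1, Q, 2Q, \ldots$ (read off from (24)), and it also returns zero for prefixes that already violate a completed sub-block, which the lemma implicitly excludes. For prefixes consistent with the earlier sub-blocks, the product reduces to the expression in the lemma. All function names are hypothetical.

```python
from itertools import product
from math import comb


def block_ranges(N, Q):
    """(first, last) 1-based indices i of the products b_i*b_{i+1} in each sub-block, P = 2."""
    M = -(-(N + 1) // Q)                          # M = ceil(L/Q) with L = N + 1
    starts = [1] + [k * Q for k in range(1, M)]
    ends = [k * Q - 1 for k in range(1, M)] + [N - 1]
    return list(zip(starts, ends))


def binom_count(n, target):
    """Number of ways n products, each +1 or -1, can sum to `target`."""
    if abs(target) > n or (n + target) % 2:
        return 0
    return comb(n, (n + target) // 2)


def count_fast_fading(prefix, N, Q, c):
    """Number of +/-1 sequences of length N extending `prefix` whose sub-block
    correlations equal c = (c_1, ..., c_M)."""
    ell = len(prefix)
    total = 1
    for (start, end), ck in zip(block_ranges(N, Q), c):
        # products already fixed by the prefix inside this sub-block
        fixed = sum(prefix[i - 1] * prefix[i] for i in range(start, min(end, ell - 1) + 1))
        free = max(end - max(start, ell) + 1, 0)  # products still undetermined
        total *= binom_count(free, ck - fixed)
    return total


if __name__ == "__main__":
    N, Q, c = 7, 3, (0, 1, -1)
    print(count_fast_fading((1,), N, Q, c))
```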



(Footnote 8, continued) where $\gamma_{i,j,k}$ is the $(i,j)$th entry of the symmetric matrix $\mathbb{G}_k$ for $1 \le i, j \le P$, and, in our setting, $\gamma_{P,P-1,k} \in \{0, \pm 1\}$ should be chosen to make the exponent of $(-1)$ an integer. Therefore, the first bit in each $b_k$ is fixed once $b_1$ is set, which indicates that with the knowledge of $b_1$, codeword $b$ can be uniquely determined by transformed codeword $\mathbb{P}_B$.
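Proof: It requires

\[
\begin{aligned}
c_1 &= b_1b_2 + \cdots + b_{Q-1}b_Q\\
&\ \ \vdots\\
c_\tau &= b_{(\tau-1)Q}b_{(\tau-1)Q+1} + \cdots + b_{\ell-1}b_\ell + b_\ell b_{\ell+1} + \cdots + b_{\tau Q-1}b_{\tau Q} = m_\ell + b_\ell b_{\ell+1} + \cdots + b_{\tau Q-1}b_{\tau Q}\\
&\ \ \vdots\\
c_M &= b_{(M-1)Q}b_{(M-1)Q+1} + \cdots + b_{N-1}b_N.
\end{aligned}
\]

Following the same argument as in Lemma 2, we obtain that the number of all possible ±1-sequences of length N, whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$ subject to (24) is given by:

\[
\binom{\tau Q-\ell}{(\tau Q-\ell+c_\tau-m_\ell)/2}\mathbf{1}\{|c_\tau-m_\ell| \le \tau Q-\ell\}
\times\prod_{k=\tau+1}^{M-1}\binom{Q}{(Q+c_k)/2}\mathbf{1}\{|c_k| \le Q\}
\times\binom{N-(M-1)Q}{(N-(M-1)Q+c_M)/2}\mathbf{1}\{|c_M| \le N-(M-1)Q\}.
\]

The proof is completed by noting that $\ell = (\tau-1)Q + (\ell \bmod Q)$, $|c_k| \le Q$ and $|c_M| \le N-(M-1)Q$ are always valid. ■

With the availability of the above lemma, the code construction algorithm in Section III-C can be performed. Next, we re-derive the maximum-likelihood decoding metric for use in the priority-first search decoding algorithm. Continuing the derivation from (22) based on $\mathbb{B}_k^T\mathbb{B}_k = \mathbb{G}_{\theta,k}$ for $1 \le k \le M$ and $1 \le \theta \le \Theta$ (cf. footnote 9), we can establish by a procedure similar to that in Section IV-A that:

\[
\hat{b} = \arg\min_{b\in\mathcal{C}}\sum_{\theta=1}^{\Theta}\left[\sum_{k=1}^{M}\sum_{m=1}^{Q+P-1}\sum_{n=1}^{Q+P-1}\left(-w^{(\theta)}_{m,n,k}\,b_{(k-1)Q-P+m+1}\,b_{(k-1)Q-P+n+1}\right)\right]\mathbf{1}\{b\in\mathcal{C}_\theta\},
\]

where for $1 \le m, n \le Q+P-1$,

\[
w^{(\theta)}_{m,n,k} = \sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j,k}\,\mathrm{Re}\{\tilde{y}_{m+i,k}\,\tilde{y}^*_{n+j,k}\},
\]

$\delta^{(\theta)}_{i,j,k}$ is the $(i,j)$th entry of $\mathbb{D}_{\theta,k} = \mathbb{G}_{\theta,k}^{-1}$, and $\tilde{y}_k = [\,\mathbb{0}_{1\times(P-1)}\ \ y_k^H\ \ \mathbb{0}_{1\times(P-1)}\,]^H = [\tilde{y}_{1,k}\ \cdots\ \tilde{y}_{Q+2P-2,k}]^T$.
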

9 Under the assumption that Q > P, the ith diagonal element of the target $\mathbb{G}_{\theta,1}$ is given by Q - i + 1, and the diagonal elements of the target $\mathbb{G}_{\theta,k}$ are equal to Q for $2 \le k < M$; hence, their inverse matrices exist. However, when $P > N-(M-1)Q$, $\mathbb{G}_{\theta,M}$ has no inverse. In such a case, we re-define $\mathbb{D}_{\theta,M}$ as:

\[
\mathbb{D}_{\theta,M} = \mathbb{0}_{[N-(M-1)Q]\times[N-(M-1)Q]} \oplus \mathbb{G}^{-1}_{\theta,M}(N-(M-1)Q+1),
\]
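As it turns out, the recursive on-the-fly metric for the priority-first search decoding algorithm is:

\[
g(b_{(\ell)})-g(b_{(\ell-1)}) =
\begin{cases}
\displaystyle\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{s,k}-b_\ell\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\delta^{(\theta)}_{i,j,k}\,\mathrm{Re}\{\tilde{y}_{s+i,k}\cdot u_{j,k}(b_{(\ell)})\}, & \text{for } P \le s \le Q;\\[3ex]
\displaystyle\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{r,k}+\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{s,k+1}-b_\ell\sum_{i=0}^{P-1}\sum_{j=0}^{P-1}\Big(\delta^{(\theta)}_{i,j,k}\,\mathrm{Re}\{\tilde{y}_{r+i,k}\cdot u_{j,k}(b_{(\ell)})\}\\
\displaystyle\qquad\qquad+\ \delta^{(\theta)}_{i,j,k+1}\,\mathrm{Re}\{\tilde{y}_{s+i,k+1}\cdot u_{j,k+1}(b_{(\ell)})\}\Big), & \text{otherwise},
\end{cases}
\]

where $-P+2 \le \ell \le N$, $s = [(\ell+P-2) \bmod Q]+1$, $r = s+Q$, $k = \max\{\lceil \ell/Q\rceil, 1\}$,

\[
\alpha^{(\eta)}_{s,k} \triangleq \sum_{n=1}^{s-1}\left|w^{(\eta)}_{s,n,k}\right|+\frac{1}{2}\left|w^{(\eta)}_{s,s,k}\right|
\]

and

\[
u_{j,k}(b_{(\ell+1)}) = u_{j,k}(b_{(\ell)})+\frac{1}{2}\left(b_\ell\,\tilde{y}^*_{s+j,k}+b_{\ell+1}\,\tilde{y}^*_{s+j+1,k}\right)
\]

with initial values $g(b_{(-P+1)}) = 0$ and $u_{j,k}(b_{((k-1)Q-P+2)}) = 0$ for $0 \le j \le P-1$ and $1 \le k \le M$. In addition, the low-complexity heuristic function is given by:

\[
h_2(b_{(\ell)}) =
\begin{cases}
\displaystyle\sum_{m=s+1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,k}-\sum_{m=s+1}^{Q+P-1}\left|v^{(\theta)}_{m,k}(b_{(\ell)})\right|-\beta^{(\theta)}_{s,k}
+\sum_{\kappa=k+1}^{M}\left(\sum_{m=1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,\kappa}-\beta^{(\theta)}_{0,\kappa}\right), & \text{for } P \le s \le Q;\\[3ex]
\displaystyle\sum_{m=s+1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,k+1}-\sum_{m=s+1}^{Q+P-1}\left|v^{(\theta)}_{m,k+1}(b_{(\ell)})\right|-\beta^{(\theta)}_{s,k+1}
+\sum_{m=r+1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,k}-\sum_{m=r+1}^{Q+P-1}\left|v^{(\theta)}_{m,k}(b_{(\ell)})\right|-\beta^{(\theta)}_{r,k}\\
\displaystyle\qquad+\sum_{\kappa=k+2}^{M}\left(\sum_{m=1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,\kappa}-\beta^{(\theta)}_{0,\kappa}\right), & \text{otherwise},
\end{cases}
\]

where s, r and k are defined the same as for $g(\cdot)$,

\[
v^{(\theta)}_{m,k}(b_{(\ell)}) \triangleq \sum_{n=1}^{s}w^{(\theta)}_{m,n,k}\,b_{(k-1)Q-P+n+1} = v^{(\theta)}_{m,k}(b_{(\ell-1)})+b_\ell\,w^{(\theta)}_{m,s,k},
\]

and

\[
\beta^{(\theta)}_{s,k} \triangleq \sum_{m=s+1}^{Q+P-1}\left(\sum_{n=s+1}^{m-1}\left|w^{(\theta)}_{m,n,k}\right|+\frac{1}{2}\left|w^{(\theta)}_{m,m,k}\right|\right)
= \beta^{(\theta)}_{s-1,k}-\sum_{n=s+1}^{Q+P-1}\left|w^{(\theta)}_{s,n,k}\right|-\frac{1}{2}\left|w^{(\theta)}_{s,s,k}\right|
\]
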
(Footnote 9, continued) where $\mathbb{G}_{\theta,M}(j)$ is a $(P-j+1) \times (P-j+1)$ matrix that contains the jth to Pth rows and the jth to Pth columns of $\mathbb{G}_{\theta,M}$.

Fig. 9. Word error rates (WERs) for the codes of Double-28, SA-14, Single-28(Q=15) and Double-28(Q=15) over the quasi-static channel with $Q_{chan} = 15$.



with initial values $v^{(\theta)}_{m,k}(b_{((k-1)Q-P+2)}) = 0$ and $\beta^{(\theta)}_{0,k} = \sum_{m=1}^{Q+P-1}\alpha^{(\theta)}_{m,k}$.

It is worth mentioning that if the single-tree code is adopted, $h_2(b_{(\ell)})$ can be further reduced to:

\[
h_2(b_{(\ell)}) =
\begin{cases}
\displaystyle\sum_{m=s+1}^{Q+P-1}\alpha^{(1)}_{m,k}-\sum_{m=s+1}^{Q+P-1}\left|v^{(1)}_{m,k}(b_{(\ell)})\right|-\beta^{(1)}_{s,k}, & \text{for } P \le s \le Q;\\[3ex]
\displaystyle\sum_{m=s+1}^{Q+P-1}\alpha^{(1)}_{m,k+1}-\sum_{m=s+1}^{Q+P-1}\left|v^{(1)}_{m,k+1}(b_{(\ell)})\right|-\beta^{(1)}_{s,k+1}
+\sum_{m=r+1}^{Q+P-1}\alpha^{(1)}_{m,k}-\sum_{m=r+1}^{Q+P-1}\left|v^{(1)}_{m,k}(b_{(\ell)})\right|-\beta^{(1)}_{r,k}, & \text{otherwise},
\end{cases}
\]

since $\sum_{m=1}^{Q+P-1}\max_{1\le\eta\le\Theta}\alpha^{(\eta)}_{m,\kappa}-\beta^{(\theta)}_{0,\kappa} = \sum_{m=1}^{Q+P-1}\alpha^{(1)}_{m,\kappa}-\beta^{(1)}_{0,\kappa} = 0$; hence, a sub-blockwise low-complexity on-the-fly decoding can indeed be conducted under the single code tree condition.

Figures 9 and 10 compare four codes over fading channels whose channel coefficients vary in every 15-symbol period. Notably, we will use $Q_{chan}$ to denote the varying period of the channel coefficients $h$, and retain Q as the design parameter for the nonlinear codes. In notations, "Double-28" and "SA-14" denote the codes defined in the previous sections, and "Single-28(Q=15)" and "Double-28(Q=15)" are the codes constructed based on the rule introduced in this section under the design parameter Q = 15. Again, the mapping between the bit patterns and codewords for the SA-14 code is defined by simulated annealing.

[Figure 10 plot: BER versus $E_b/N_0$ (dB).]

Fig. 10. Bit error rates (BERs) for the codes of Double-28, SA-14, Single-28(Q=15) and Double-28(Q=15) over the quasi-static channel with $Q_{chan} = 15$.

Both Figs. 9 and 10 show that the Double-28 code seriously degrades when the channel coefficients unexpectedly vary in an intra-codeword fashion. This hints that the assumption that the channel coefficients remain constant in a coding block is critical in the code design in Section III. Figures 11 and 12 then indicate that the codes that take into consideration the varying nature of the channel coefficients within a codeword are robust in performance when applied to channels with constant coefficients. Thus, we may conclude that for a channel whose coefficients vary more often than a coding block, it is advantageous to use the code design for a fast-fading environment considered in this section.

A more striking result from Fig. 9 is that even though the codeword length of the Single-28(Q=15) and the Double-28(Q=15) codes is twice that of the SA-14 code, their word error rates are still markedly superior at medium-to-high SNRs. Note that the SA-14 code is the computer-optimized code specifically for the $Q_{chan} = 15$ channel. This hints that when the channel memory order is known, a performance gain can be obtained by considering the inter-subblock correlation, which favors a longer code design.

[Figure 11 plot: WER versus SNR (dB).]

Fig. 11. Word error rates (WERs) for the codes of Double-28, SA-14, Single-28(Q=15) and Double-28(Q=15) over the quasi-static channel with $Q_{chan} \ge 29$.

The decoding complexity, measured in terms of the average number of node expansions per information bit, for the Single-28(Q=15) and Double-28(Q=15) codes is illustrated in Fig. 13. A similar observation is obtained: the decoding metric $f_2$ yields lower decoding complexity than the on-the-fly decoding metric $f_1$.

VII. Conclusions 

In this paper, we established a systematic rule to construct codes based on the optimal signal-to-noise ratio framework that requires every codeword to satisfy a "self-orthogonal" property to combat the multipath channel effect. Enforced by this structure, we can then derive a recursive maximum-likelihood metric for the tree-based priority-first search decoding algorithm, and hence avoid the use of the time-consuming exhaustive decoder that was previously used in [3], [6], [18] to decode the structureless computer-optimized codes. Simulations demonstrate that the rule-based codes we constructed have almost identical performance to the computer-optimized codes, but their decoding complexity, as anticipated, is much lower than that of the exhaustive decoder.

[Figure 12 plot: BER curves for Double-28, SA-14, Single-28(Q=15) and Double-28(Q=15).]

Fig. 12. Bit error rates (BERs) for the codes of Double-28, SA-14, Single-28(Q=15) and Double-28(Q=15) over the quasi-static channel with $Q_{chan} \ge 29$.

Moreover, two maximum-likelihood decoding metrics were proposed. The first one can be used in an on-the-fly fashion, while the second one, which has a much lower decoding complexity, requires the knowledge of all channel outputs. The trade-off between them is thus evident from our simulations.

An extension of the code design to a fast-varying quasi-static environment is added in Section VI. Although we only derive the coding rule and its decoding metric for a fixed Q, further extension to the situation where the channel coefficients $h$ vary non-stationarily, i.e., where the periods $Q_1, Q_2, \ldots, Q_M$ are not equal, is straightforward. Such a design may be suitable for, e.g., the frequency-hopping scheme of the Global System for Mobile communications (GSM) and the Universal Mobile Telecommunications System (UMTS), and the time-hopping scheme in IS-54, in which cases the channel coefficients change (or hop) at protocol-aware scheduled time instants, as similarly mentioned in [12].

[Figure 13 plot: average number of node expansions per information bit versus SNR (dB) for SEQ-Double-28(Q=15) and SEQ-Single-28(Q=15) under $f_1$ and $f_2$.]

Fig. 13. Average numbers of node expansions per information bit for the codes of Single-28(Q=15) and Double-28(Q=15) using the priority-first search decoding guided by either evaluation function $f_1$ or evaluation function $f_2$ over the quasi-static channel with $Q_{chan} = 15$.

A limitation of the code design we proposed is that the decoding complexity grows exponentially with the codeword length. This constraint is owing to the tree structure of our constructed codes. An interesting and useful direction for future work is to re-design the self-orthogonal codes so that they fit into a trellis structure, and to make them maximum-likelihood decodable by either the priority-first search algorithm or a Viterbi-based algorithm.
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Appendix 



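Lemma 4: Fix $P > 1$ and $k \ge 1$. For given integers $\boldsymbol{q} = (q_1, q_2, \ldots, q_{P-1}) \in [-k, k]^{P-1}$ and
$(d_{2-P}, \ldots, d_0) \in \{\pm 1\}^{P-1}$, let the number of $d_1, d_2, \ldots, d_k$ that simultaneously satisfy

\[ q_j = \sum_{i=1}^{k} d_{i-j}\, d_i \quad \text{for } 1 \le j \le P-1 \]

be denoted by $A_k(\boldsymbol{q} \mid d_{2-P}, \ldots, d_0)$. Also, let $\mathbb{G}(\boldsymbol{c})$ be the $P \times P$ matrix of the Toeplitz form:

\[
\mathbb{G}(\boldsymbol{c}) =
\begin{bmatrix}
N & c_1 & c_2 & \cdots & c_{P-1} \\
c_1 & N & c_1 & \cdots & c_{P-2} \\
c_2 & c_1 & N & \cdots & c_{P-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_{P-1} & c_{P-2} & c_{P-3} & \cdots & N
\end{bmatrix},
\]

where $\boldsymbol{c} = (c_1, c_2, \ldots, c_{P-1}) \in \{\pm 1\}^{P-1}$. Then, for $\ell = 1, 2, \ldots, N$,

\[ |\mathcal{A}(\boldsymbol{b}_{(\ell)} \mid \mathbb{G}(\boldsymbol{c}))| = A_{N-\ell}(\boldsymbol{c} - \boldsymbol{m}_\ell \mid b_{\ell-P+2}, \ldots, b_\ell), \]

where $\boldsymbol{m}_\ell = (m_\ell^{(1)}, \ldots, m_\ell^{(P-1)})$ and $m_\ell^{(k)} = (b_1 b_{k+1} + \cdots + b_{\ell-k} b_\ell) \cdot 1\{\ell > k\}$.

Proof: For $\mathbb{B}^T\mathbb{B} = \mathbb{G}(\boldsymbol{c})$, it requires

\begin{align*}
c_1 &= b_1 b_2 + b_2 b_3 + \cdots + b_{N-1} b_N = m_\ell^{(1)} + b_\ell b_{\ell+1} + \cdots + b_{N-1} b_N \\
c_2 &= b_1 b_3 + b_2 b_4 + \cdots + b_{N-2} b_N = m_\ell^{(2)} + b_{\ell-1} b_{\ell+1} + \cdots + b_{N-2} b_N \\
&\;\;\vdots \\
c_{P-1} &= b_1 b_P + b_2 b_{P+1} + \cdots + b_{N-P+1} b_N = m_\ell^{(P-1)} + b_{\ell-P+2} b_{\ell+1} + \cdots + b_{N-P+1} b_N.
\end{align*}

Rewriting the above equations as

\begin{align*}
c_1 - m_\ell^{(1)} &= b_\ell b_{\ell+1} + \cdots + b_{N-1} b_N \\
c_2 - m_\ell^{(2)} &= b_{\ell-1} b_{\ell+1} + \cdots + b_{N-2} b_N \\
&\;\;\vdots \\
c_{P-1} - m_\ell^{(P-1)} &= b_{\ell-P+2} b_{\ell+1} + \cdots + b_{N-P+1} b_N,
\end{align*}

we obtain

\[ |\mathcal{A}(\boldsymbol{b}_{(\ell)} \mid \mathbb{G}(\boldsymbol{c}))| = A_{N-\ell}(\boldsymbol{c} - \boldsymbol{m}_\ell \mid b_{\ell-P+2}, \ldots, b_\ell). \]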
It can be easily verified that $A_k(\boldsymbol{q} \mid -d_{2-P}, \ldots, -d_0) = A_k(\boldsymbol{q} \mid d_{2-P}, \ldots, d_0)$ since

\[ q_j = \sum_{i=1}^{j} (-d_{i-j})\, d_i + \sum_{i=j+1}^{k} d_{i-j}\, d_i = \sum_{i=1}^{j} d_{i-j}\, (-d_i) + \sum_{i=j+1}^{k} (-d_{i-j})(-d_i). \]

Therefore, only $2^{P-2}$ tables are required. The tables of $A_k(\boldsymbol{q} \mid d_{-P+2}, \ldots, d_0)$ for $P = 3$ and
$1 \le k \le 5$ are illustrated in Table II as an example.
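
As an illustration of how such tables can be generated, the following Python sketch computes $A_k(\boldsymbol{q} \mid d_{2-P}, \ldots, d_0)$ by brute-force enumeration directly from the definition in Lemma 4. The function name `count_table` and the dictionary-based indexing are assumptions of this sketch and are not taken from the paper; a practical implementation would presumably build the tables recursively in $k$ rather than enumerate all $2^k$ sequences.

```python
import itertools
from collections import Counter

def count_table(k, init):
    """Map each q = (q_1, ..., q_{P-1}) to A_k(q | init).

    `init` holds the initial values (d_{2-P}, ..., d_0), so P = len(init) + 1.
    The count is taken over all d_1, ..., d_k in {-1, +1}.
    """
    P = len(init) + 1
    table = Counter()
    for tail in itertools.product((-1, 1), repeat=k):
        # d[i] for i = 2-P, ..., k (non-positive indices are the initial values)
        d = dict(zip(range(2 - P, k + 1), tuple(init) + tail))
        q = tuple(sum(d[i - j] * d[i] for i in range(1, k + 1))
                  for j in range(1, P))
        table[q] += 1
    return table

# Example for P = 3 and k = 3 with the two sign patterns of (d_{-1}, d_0)
same_sign = count_table(3, (1, 1))
opposite_sign = count_table(3, (1, -1))
```

For $P = 3$, the two calls above cover every case, since negating both initial values leaves the table unchanged.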
TABLE II

Tables of $A_k(\boldsymbol{q} \mid d_{-1}, d_0)$ for $P = 3$.
Decodable Codes for Combined Channel
Estimation and Error Protection
Yunghsiang S. Han, Member, IEEE, and Ming-Hsin Kuo

Abstract 

The code that combines channel estimation and error protection has received considerable attention
recently, and has been considered a promising methodology to compensate for the multipath fading effect. It has
been shown by simulations that such a code design can considerably improve the system performance over
the conventional design with separate channel estimation and error protection modules under the same
code rate. Nevertheless, the major obstacle that prevents the practical use of these codes is that the existing
codes are mostly found by computer search, and hence exhibit no good structure for efficient decoding.
Hence, time-consuming exhaustive search becomes the only decoding choice, and the decoding
complexity increases dramatically with the codeword length. In this paper, by optimizing the signal-to-noise
ratio, we found a systematic construction of codes for combined channel estimation and error
protection, and confirmed by simulations that its performance is equivalent to that of the computer-searched codes.
Moreover, the structured codes that we construct by rules are maximum-likelihood decodable in
terms of a newly derived recursive metric for use with the priority-first search decoding algorithm. Thus,
the decoding complexity is reduced significantly compared with that of the exhaustive decoder.
The extended code design for fast-fading channels is also presented. Simulations show that our
constructed extension code is robust in performance even if the coherent period is shorter than the
codeword length.

Index Terms 

Code design, Priority-first search decoding, Training codes, Time-varying multipath fading channel, 
Channel estimation, Channel equalization, Error-control coding 

I. INTRODUCTION 

The new demand for wireless communications in recent years has inspired rapid advances in
wireless transmission technology. Technology blossoms in both high-mobility low-bit-rate and
low-mobility high-bit-rate transmissions. Apparently, the next challenge in wireless communications
would be to reach a high transmission rate under high mobility. The main technological obstacle
for high-bit-rate transmission under high mobility is the highly time-varying channel
characteristic due to movement; such a characteristic further increases the difficulty of compensating
for the intersymbol interference. Presently, a typical receiver for wireless communications
usually contains separate modules for channel estimation and channel equalization.
The former module estimates the channel parameters based on a known training sequence or
pilots, while the latter module uses these estimated channel parameters to eliminate the channel
effects due to multipath fading. However, the effectiveness of such a system structure in eliminating
channel fading may be degraded in a fast time-varying environment, which makes
high-bit-rate transmission in a high-mobility environment a big challenge.

Recent studies [?][?][?][?][?] have confirmed that better system performance can be obtained
by jointly considering a number of system devices, such as channel coding, channel
equalization, channel estimation, and modulation, when compared with a system with individually
optimized devices. Specifically, some works that combine the devices of codeword decision and
channel effect cancellation in typical receivers can appropriately dispense with channel estimation
and still perform well. In 1994, Seshadri [?] first proposed a blind maximum-likelihood sequence
estimator (MLSE) in which the data and channel are simultaneously estimated. Skoglund et al.
[?] later provided milestone evidence that the joint design system is superior
in combating serious multipath block fading. They also applied a similar technique to a
multiple-input-multiple-output (MIMO) system in a subsequent work [?]. In short, Skoglund
et al. looked for non-linear codes that are suitable for this channel by computer search.
Through simulations, they found that the non-linear code that combines channel estimation
and error protection, when carefully designed by considering the multipath fading effect,
outperforms a typical communication system with perfect channel estimation by at least 2 dB.
Their results suggest the high potential of applying a single, perhaps non-linear, code to improve
the transmission rate in a highly mobile environment, in which channel estimation becomes
technically infeasible. A similar approach was also proposed in [?], whose authors
named such codes the training codes. In [?], Chugg and Polydoros derived a recursive metric
for joint maximum-likelihood (ML) decoding, and hinted that the recursive metric may only be
used with the sequential algorithms [?]. As there are no efficient decoding approaches for the
codes mentioned above, these authors mostly considered only codes of short length, or even just
the principle of code design for combined channel estimation and error protection.

One of the drawbacks of these combined-channel-estimation-and-error-protection codes is that
only exhaustive search can be used to decode their codewords due to the lack of systematic structure.
Such a drawback apparently inhibits the use of codes for combined channel estimation and error
protection in practical applications. This leads to a natural research question on how to construct
an efficiently decodable code with channel estimation and error protection functions.

In this work, this research question is resolved by first finding that the codeword that maximizes
the system signal-to-noise ratio (SNR) should be orthogonal to its delayed counterpart. We then
find that a code consisting of properly chosen self-orthogonal codewords can compete
with the computer-searched codes in performance. With this self-orthogonality property, the
maximum-likelihood metrics for these structured codewords can be equivalently fit into a recursive
formula, and hence, the priority-first search decoding algorithm can be employed. As a consequence,
the decoding complexity, as compared to exhaustive decoding, is reduced considerably.
Extensions of our proposed coding structure, which was originally designed for channels with
constant coefficients, to channels with varying channel coefficients within a codeword block are
also established. Simulations show that our constructed extension code is robust even for a
channel whose coefficients vary more often than once per coding block.

The paper is organized as follows. Section ?? describes the system model considered, followed
by the technical background required in this work. In Section ??, the coding rule that optimizes
the system SNR is established, and is subsequently used to construct the codes for combined
channel estimation and error protection. The corresponding recursive maximum-likelihood decoding
metrics for our rule-based systematic codes are derived in Section ??. Simulations are
summarized and remarked upon in Section ??. The extension to channels with varying coefficients within
a codeword is presented in Section ??. Section ?? concludes the paper.

In this work, superscripts "H" and "T" are specifically reserved for the matrix
Hermitian transpose and transpose operations, respectively [?], and should not be confused with
the matrix exponent.



II. Background 
A. System model and maximum-likelihood decoding criterion 

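The system model defined in this section and the notations used throughout follow those in
[?].

Transmit a codeword $\boldsymbol{b} = [b_1, \ldots, b_N]^T$, where each $b_j \in \{\pm 1\}$, of an $(N, K)$ code $\mathcal{C}$ over
a block fading (specifically, quasi-static fading) channel of memory order $(P - 1)$. Denote the
channel coefficients by $\boldsymbol{h} = [h_1, \ldots, h_P]^T$, which are assumed constant within a coding block. The
complex-valued received vector is then given by:

\[ \boldsymbol{y} = \mathbb{B}\boldsymbol{h} + \boldsymbol{n}, \tag{1} \]

where $\boldsymbol{n}$ is zero-mean complex-Gaussian distributed with $E[\boldsymbol{n}\boldsymbol{n}^H] = \sigma_n^2 \mathbb{I}_L$, $\mathbb{I}_L$ is the $L \times L$
identity matrix, and

\[
\mathbb{B} =
\begin{bmatrix}
b_1 & 0 & \cdots & 0 \\
b_2 & b_1 & \ddots & \vdots \\
\vdots & b_2 & \ddots & 0 \\
b_N & \vdots & \ddots & b_1 \\
0 & b_N & & b_2 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & b_N
\end{bmatrix}_{L \times P}.
\]
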

Some assumptions are made in the following. Both the transmitter and the receiver know
nothing about the channel coefficients $\boldsymbol{h}$, but have knowledge of the multipath parameter $P$ or
its upper bound. Besides, there is an adequate guard period between two encoding blocks so that
zero interblock interference is guaranteed. Based on the system model in (??) and the above
assumptions, we can derive [?] the least square (LS) estimate of the channel coefficients $\boldsymbol{h}$ for a
given $\boldsymbol{b}$ (interchangeably, $\mathbb{B}$) as:

\[ \hat{\boldsymbol{h}} = (\mathbb{B}^T\mathbb{B})^{-1}\mathbb{B}^T\boldsymbol{y}, \]

and the joint maximum-likelihood (ML) decision on the transmitted codeword becomes:

\[ \hat{\boldsymbol{b}} = \arg\min_{\boldsymbol{b} \in \mathcal{C}} \|\boldsymbol{y} - \mathbb{B}\hat{\boldsymbol{h}}\|^2 = \arg\min_{\boldsymbol{b} \in \mathcal{C}} \|\boldsymbol{y} - \mathbb{P}_B \boldsymbol{y}\|^2, \tag{2} \]

where $\mathbb{P}_B = \mathbb{B}(\mathbb{B}^T\mathbb{B})^{-1}\mathbb{B}^T$. Notably, codeword $\boldsymbol{b}$ and transformed codeword $\mathbb{P}_B$ are not in
one-to-one correspondence unless the first element of $\boldsymbol{b}$, namely $b_1$, is fixed. For convenience, we will always
set $b_1 = -1$ for the codebooks we construct in the sequel.
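
To make the decision rule in (2) concrete, the following Python sketch builds the matrix of delayed codeword copies, forms the LS channel estimate, and performs the joint ML decision by exhaustive search. It is only a minimal illustration under stated assumptions: $L = N + P - 1$ is inferred from the banded shape of the matrix, the toy codebook is hypothetical and is not a code from the paper, and the function names `conv_matrix`, `ml_metric` and `exhaustive_ml_decode` are introduced here for illustration.

```python
import itertools
import numpy as np

def conv_matrix(b, P):
    """L x P matrix whose p-th column is the codeword b delayed by p symbols."""
    N = len(b)
    B = np.zeros((N + P - 1, P), dtype=complex)   # assumption: L = N + P - 1
    for p in range(P):
        B[p:p + N, p] = b
    return B

def ml_metric(y, b, P):
    """||y - P_B y||^2, computed through the least-squares channel estimate."""
    B = conv_matrix(b, P)
    h_ls = np.linalg.lstsq(B, y, rcond=None)[0]   # h_hat = (B^T B)^{-1} B^T y
    return float(np.linalg.norm(y - B @ h_ls) ** 2)

def exhaustive_ml_decode(y, codebook, P):
    """Joint ML decision by exhaustive search over the codebook."""
    return min(codebook, key=lambda b: ml_metric(y, b, P))

# Hypothetical toy codebook: all length-5 words with b_1 = -1.
codebook = [(-1,) + tail for tail in itertools.product((-1, 1), repeat=4)]
P = 2
rng = np.random.default_rng(0)
h = rng.normal(size=P) + 1j * rng.normal(size=P)  # channel, unknown to the decoder
sent = codebook[5]
y = conv_matrix(sent, P) @ h                      # noiseless reception for clarity
decoded = exhaustive_ml_decode(y, codebook, P)
```

The decoder never uses `h`; the channel enters only through the received vector, which is the sense in which channel estimation and codeword decision are performed jointly. The cost of this exhaustive search grows with the codebook size, which is the motivation for the structured codes and the tree search discussed later.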

B. Summary of previous and our code designs for combined channel estimation and error 
protection 

In the literature, no systematic code constructions have been proposed for combined channel
estimation and error protection for quasi-static fading channels. Efforts were mostly placed
on how to find proper sequences to compensate for the channel fading by computer search
[?][?][?][?][?][?][?]. Decodability of the perhaps structureless computer-searched codes thus
becomes an engineering challenge.

In 2003, Skoglund, Giese and Parkvall [?] searched by computers for nonlinear binary block 
codes suitable for combined estimation and error protection for quasi-static fading channels by 
minimizing the sum of the pairwise error probabilities (PEP) under equal prior, namely, 

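\[ P_e \le \frac{1}{2^K} \sum_{i=1}^{2^K} \sum_{j=1, j \ne i}^{2^K} \Pr\left(\hat{\boldsymbol{b}} = \boldsymbol{b}(j) \mid \boldsymbol{b}(i) \text{ transmitted}\right), \tag{3} \]
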

where $\boldsymbol{b}(i)$ denotes the $i$th codeword of the $(N, K)$ nonlinear block code. Although the operating
signal-to-noise ratio (SNR) for the code search was set at 10 dB, their simulation results showed
that the codes found perform well over a wide range of SNRs. In addition, the mismatch
in the relative powers of different channel coefficients, as well as in the channel Rice factors
[?], has little effect on the resultant performance. It was concluded that, in comparison with a
system with the benchmark error correcting code and a perfect channel estimator, significant
performance improvement can be obtained by adopting their computer-searched nonlinear codes.
Later in 2005, Coskun and Chugg [?] replaced the PEP in (??) by a properly defined pairwise
distance measure between two codewords, and proposed a suboptimal greedy algorithm to speed
up the code search process. In 2007, Giese and Skoglund [?] re-applied their original idea to
single- and multiple-antenna systems, and used the asymptotic PEP and the generic gradient-search
algorithm in place of the PEP and the simulated annealing algorithm in [?] to reduce the
system complexity.

At the end of [?], the authors pointed out that "an important topic for further research is to 
study how the decoding complexity of the proposed scheme can be decreased." They proceeded 
to state that along this research line, "one main issue is to investigate what kind of structure 
should be enforced on the code to allow for simplified decoding." 

Motivated by these closing statements, we take a different approach to code design.
Specifically, we pursued and established a systematic code design rule for combined channel
estimation and error protection for quasi-static fading channels, and confirmed that the codes
constructed based on such a rule maximize the average system SNR. Since it so happens that the
computer-searched code in [?] satisfies this rule, its insensitivity to SNRs, as well as to channel
mismatch, finds a theoretical footing. Thanks to the systematic structure of our
rule-based constructed codes, we can then derive a recursive maximum-likelihood decoding
metric for use with the priority-first search decoding algorithm. The decoding complexity is therefore
significantly decreased at moderate-to-high SNRs, in contrast to the exhaustive decoder required
for the structureless computer-searched codes.

It is worth mentioning that although the codes searched by computers in [?][?] target
unknown channels, for which the channel coefficients are assumed constant in a coding block,
the evaluation of the PEP criterion does require presuming knowledge of the channel statistics.
The code constructed based on the rule we propose, however, is guaranteed to maximize the
system SNR regardless of the statistics of the channels. This hints that our code can still be
applied to situations where channel blindness is a strict system restriction. Details will
be introduced in subsequent sections.

C. Maximum-likelihood priority-first search decoding algorithm 

For a better understanding, we give a short description of a code tree for the $(N, K)$ code
$\mathcal{C}$, over which the decoding search is performed, before describing the priority-first search
decoding algorithm in this subsection.

A code tree of an $(N, K)$ binary code represents every codeword as a path on a binary tree as
shown in Fig. ??. The code tree consists of $(N + 1)$ levels. The single leftmost node at level
zero is usually called the origin node. There are at most two branches leaving each node at each
level. The $2^K$ rightmost nodes at level $N$ are called the terminal nodes.

Each branch on the code tree is labeled with the appropriate code bit $b_i$. As a convention,
the path from the single origin node to one of the $2^K$ terminal nodes is termed the code path
corresponding to the codeword. Since there is a one-to-one correspondence between the codeword
and the code path of $\mathcal{C}$, a codeword can be interchangeably referred to by its respective code
path or the branch labels that the code path traverses. Similarly, for any node in the code tree,
there exists a unique path traversing from the single origin node to it; hence, a node can also
be interchangeably indicated by the path (or the path labels) ending at it. We can then denote the
path ending at a node at level $\ell$ by the branch labels $[b_1, b_2, \ldots, b_\ell]$ it traverses. For convenience,
we abbreviate $[b_1, b_2, \ldots, b_\ell]^T$ as $\boldsymbol{b}_{(\ell)}$, and will drop the subscript when $\ell = N$. The successor
paths of a path $\boldsymbol{b}_{(\ell)}$ are those whose first $\ell$ labels are exactly the same as $\boldsymbol{b}_{(\ell)}$.
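
The code tree described above can be represented simply by the set of label prefixes at each level. The short Python sketch below does this for a hypothetical (4, 2) codebook with $b_1 = -1$ (chosen for illustration only; it is not the computer-searched code of the figure), and the helper names `tree_levels` and `children` are assumptions of this sketch. Here `children` returns only the immediate successors of a path, i.e., the successor paths that are one level deeper.

```python
def tree_levels(codebook):
    """For each level 0..N, the set of paths (label prefixes) ending at that level."""
    N = len(codebook[0])
    return [{tuple(b[:level]) for b in codebook} for level in range(N + 1)]

def children(path, codebook):
    """Immediate successors of `path`: prefixes one label longer that agree with it."""
    level = len(path)
    return sorted({tuple(b[:level + 1]) for b in codebook
                   if tuple(b[:level]) == tuple(path)})

# Hypothetical (4, 2) codebook with b_1 fixed to -1
codebook = [(-1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1)]
levels = tree_levels(codebook)
nodes_per_level = [len(paths) for paths in levels]   # origin node first, terminal nodes last
```

Level zero holds the single origin node (the empty path), and the last level holds one terminal node per codeword.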
Fig. 1. The code tree for a computer-searched PEP-minimum (4, 2) code with $b_1$ fixed as $-1$.
The priority-first search on a code tree is guided by an evaluation function $f$ that is defined
for every path. It can be typically algorithmized as follows.

Step 1. (Initialization) Load the Stack with the path that ends at the origin node.
Step 2. (Evaluation) Evaluate the $f$-function values of the successor paths of the current top
path in the Stack, and delete this top path from the Stack.
Step 3. (Sorting) Insert the successor paths obtained in Step ?? into the Stack such that the
paths in the Stack are ordered according to their ascending $f$-function values.
Step 4. (Loop) If the top path in the Stack ends at a terminal node in the code tree, output the
labels corresponding to the top path, and the algorithm stops; otherwise, go to Step ??.

It remains to find the evaluation function $f$ that secures the maximum-likelihoodness of the
output codeword. We begin with the introduction of a sufficient condition under which the above
priority-first search algorithm guarantees to locate the code path with the smallest $f$-function
value among all code paths of $\mathcal{C}$.
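
The four steps above can be sketched in Python with a binary heap playing the role of the Stack. This is only an illustrative sketch: the codebook is hypothetical, and the evaluation function `toy_f` (the Hamming distance between a path and the corresponding prefix of a hard-decision word `received`) is a stand-in chosen because it is non-decreasing along every path; it is not the recursive ML metric derived in this paper. The loop pops the top path before testing whether it is terminal, which is an equivalent reordering of Steps 2 to 4.

```python
import heapq
import itertools

def priority_first_search(codebook, f):
    """Return (code path with the smallest f, number of node expansions)."""
    N = len(codebook[0])
    tie = itertools.count()                  # tie-breaker for paths with equal f-values
    stack = [(f(()), next(tie), ())]         # Step 1: path ending at the origin node
    expansions = 0
    while stack:
        _, _, path = heapq.heappop(stack)    # current top path (smallest f-value)
        if len(path) == N:                   # Step 4: the top path reaches a terminal node
            return path, expansions
        expansions += 1
        level = len(path)
        # Step 2: immediate successors that are still prefixes of some codeword
        labels = sorted({b[level] for b in codebook if b[:level] == path})
        for bit in labels:
            child = path + (bit,)
            heapq.heappush(stack, (f(child), next(tie), child))   # Step 3: ordered insert
    raise RuntimeError("no code path found")

# Hypothetical (4, 2) codebook with b_1 = -1 and a hypothetical hard-decision word
codebook = [(-1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1)]
received = (-1, 1, 1, 1)

def toy_f(path):
    """Hamming distance between the path and the prefix of `received` (non-decreasing)."""
    return sum(p != r for p, r in zip(path, received))

best, expansions = priority_first_search(codebook, toy_f)
```

Using a heap keeps the ordered insertion of Step 3 at logarithmic cost per path, whereas a literally sorted list would require linear-time insertion.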

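Lemma 1: If $f$ is non-decreasing along every path in the code tree, i.e.,

\[ f(\boldsymbol{b}_{(\ell)}) \le \min_{\{\tilde{\boldsymbol{b}} \in \mathcal{C} \,:\, \tilde{\boldsymbol{b}}_{(\ell)} = \boldsymbol{b}_{(\ell)}\}} f(\tilde{\boldsymbol{b}}), \tag{4} \]

the priority-first search algorithm always outputs the code path with the smallest $f$-function
value among all code paths of $\mathcal{C}$.

Proof: Let $\boldsymbol{b}^*$ be the first top path that reaches a terminal node (and hence is the output
code path of the priority-first search algorithm). Then, Step 3 of the algorithm ensures that $f(\boldsymbol{b}^*)$
is no larger than the $f$-function value of any path currently in the Stack. Since condition (??)
guarantees that the $f$-function value of any other code path, which should be the offspring of
some path $\boldsymbol{b}_{(\ell)}$ existing in the Stack, is no less than $f(\boldsymbol{b}_{(\ell)})$, we have

\[ f(\boldsymbol{b}^*) \le f(\boldsymbol{b}_{(\ell)}) \le \min_{\{\tilde{\boldsymbol{b}} \in \mathcal{C} \,:\, \tilde{\boldsymbol{b}}_{(\ell)} = \boldsymbol{b}_{(\ell)}\}} f(\tilde{\boldsymbol{b}}). \]

Consequently, the lemma follows. ■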

In the design of the search-guiding function $f$, it is convenient to divide it into the sum of
two parts. In order to perform maximum-likelihood decoding, the first part $g$ can be directly
defined based on the maximum-likelihood metric of the codewords such that from (??),

\[ \arg\min_{\boldsymbol{b} \in \mathcal{C}} g(\boldsymbol{b}) = \arg\min_{\boldsymbol{b} \in \mathcal{C}} \|\boldsymbol{y} - \mathbb{P}_B \boldsymbol{y}\|^2. \]

After $g$ is defined, the second part $h$ can be designed to validate (??) with $h(\boldsymbol{b}) = 0$ for any
$\boldsymbol{b} \in \mathcal{C}$. Then, from $f(\boldsymbol{b}) = g(\boldsymbol{b}) + h(\boldsymbol{b}) = g(\boldsymbol{b})$ for all $\boldsymbol{b} \in \mathcal{C}$, the desired maximum-likelihood
priority-first search decoding algorithm is established since (??) is valid.

In principle, both $g(\cdot)$ and $h(\cdot)$ range over all possible paths in the code tree. The first part,
$g(\cdot)$, is simply a function of all the branches traversed thus far by the path, while the second
part, $h(\cdot)$, called the heuristic function, helps predict a future route from the end node of the
current path to a terminal node [?]. Notably, the design of the heuristic function $h$ that
validates condition (??) is not unique. Different designs may result in variations in computational
complexity.

We close this section by summarizing the targets of this work based on what has been
mentioned in this section.

1) A code of comparable performance to the computer-searched code is constructed according
to certain rules so that its code tree can be efficiently and systematically generated
(Section ??).

2) Efficient recursive computation of the maximum-likelihood evaluation function $f$ from the
predecessor path to the successor paths is established (Section ??).

3) With the availability of items ?? and ??, the construction and maximum-likelihood decoding
of codes with longer codeword length become possible; this in turn makes the
assumption that the unknown channel coefficients $\boldsymbol{h}$ are fixed during a long coding block
somewhat impractical, especially for mobile transceivers. An extension of items ?? and ?? to
unknown channels whose channel coefficients may change several times during one
coding block is therefore further proposed (Section ??).

III. Code Construction 

In this section, the code design rule that guarantees the maximization of the system SNR 
regardless of the channel statistics is presented, followed by the algorithm to generate the code 
based on such rule. 
A. Code rule that maximizes the average SNR

A known inequality [?] for the product of two positive semidefinite Hermitian matrices,
$A$ and $B$, is that

\[ \mathrm{tr}(AB) \le \mathrm{tr}(A) \cdot \lambda_{\max}(B), \tag{5} \]

where $\mathrm{tr}(\cdot)$ represents the matrix trace operation, and $\lambda_{\max}(B)$ is the maximal eigenvalue of
$B$ [?]. The above inequality holds with equality when $B$ is an identity matrix.

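From the system model $\boldsymbol{y} = \mathbb{B}\boldsymbol{h} + \boldsymbol{n}$, it can be derived that the average SNR satisfies:

\begin{align*}
\text{Average SNR} &= \frac{E[\|\mathbb{B}\boldsymbol{h}\|^2]}{E[\|\boldsymbol{n}\|^2]}
= \frac{E[\mathrm{tr}(\boldsymbol{h}^H \mathbb{B}^T \mathbb{B} \boldsymbol{h})]}{L\sigma_n^2}
= \frac{\mathrm{tr}(E[\boldsymbol{h}\boldsymbol{h}^H]\, \mathbb{B}^T \mathbb{B})}{L\sigma_n^2} \\
&= \frac{N}{L\sigma_n^2}\, \mathrm{tr}\!\left(E[\boldsymbol{h}\boldsymbol{h}^H]\, \frac{1}{N}\mathbb{B}^T\mathbb{B}\right)
\le \frac{N}{L\sigma_n^2}\, \mathrm{tr}(E[\boldsymbol{h}\boldsymbol{h}^H])\, \lambda_{\max}\!\left(\frac{1}{N}\mathbb{B}^T\mathbb{B}\right).
\end{align*}

Then, the theory on Ineq. (??) implies that taking

\[
\frac{1}{N}\mathbb{B}^T\mathbb{B} =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}_{P \times P}
\tag{6}
\]

will optimize the average SNR regardless of the statistics of $\boldsymbol{h}$ [?].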

The existence of codeword sequences satisfying (??) is guaranteed only for $P = 2$ with $N$ odd (and
trivially, $P = 1$). In some other cases such as $P = 3$, one can only design codes to approximately
satisfy (??) as: